Abstract. Consider an autoregressive model with measurement error: we observe
$Z_i = X_i + \varepsilon_i$, where $X_i$ is a stationary solution of the autoregressive equation
$X_i = f_{\theta^0}(X_{i-1}) + \xi_i$. The regression function $f_{\theta^0}$ is known up to a finite dimensional
parameter $\theta^0$. The distributions of $X_0$ and $\xi_1$ are unknown whereas the distribution of $\varepsilon_0$ is
completely known. We want to estimate the parameter $\theta^0$ by using the observations
$Z_0, \dots, Z_n$. We propose an estimation procedure based on a modified least squares criterion.
This procedure provides an asymptotically normal estimator $\hat\theta$ of $\theta^0$, for a large
class of regression functions and various noise distributions.

Keywords: autoregressive model, Markov chain, mixing, deconvolution, semi-parametric model. 
AMS 2000 MSC: Primary 62J02, 62F12, Secondary 62G05, 62G20. 

1. Introduction 
We consider an autoregressive model with measurement error satisfying

(1.1)    $Z_i = X_i + \varepsilon_i, \qquad X_i = f_{\theta^0}(X_{i-1}) + \xi_i,$

where one observes $Z_0, \dots, Z_n$ and the random variables $\xi_i, X_i, \varepsilon_i$ are unobserved. The regression
function $f_{\theta^0}$ is known up to a finite dimensional parameter $\theta^0$, belonging to the interior $\Theta^\circ$ of
a compact set $\Theta \subset \mathbb{R}^d$. The centered innovations $(\xi_i)_{i\ge 1}$ and the errors $(\varepsilon_i)_{i\ge 0}$ are independent
and identically distributed (i.i.d.) random variables with finite variances $\mathrm{Var}(\xi_1) = \sigma_\xi^2$ and
$\mathrm{Var}(\varepsilon_0) = \sigma_\varepsilon^2$. We assume that $\varepsilon_0$ admits a known density with respect to the Lebesgue measure,
denoted by $f_\varepsilon$. Furthermore we assume that the random variables $X_0$, $(\xi_i)_{i\ge 1}$ and $(\varepsilon_i)_{i\ge 0}$ are
independent. The distribution of $\xi_1$ is unknown and does not necessarily admit a density with
respect to the Lebesgue measure. We assume that $(X_i)_{i\ge 0}$ is strictly stationary, which means
that the initial distribution of $X_0$ is an invariant distribution for the transition kernel of the
homogeneous Markov chain $(X_i)_{i\ge 0}$.

Our aim is to estimate $\theta^0$ for a large class of functions $f_\theta$, whatever the known error
distribution, and without the knowledge of the distribution of the $\xi_i$'s. The distribution of the
innovations being unknown, this model belongs to the family of semi-parametric models.

Previously known results. Several authors have considered the case where the function $f_\theta$
is linear (in both $\theta$ and $x$), see e.g. Andersen and Deistler (?), Nowak (?), Chanda (?, ?),
Staudenmayer and Buonaccorsi (?), and Costa et al. (?). We can note that, in this specific case,
the model (1.1) is also an ARMA model (see Section 4.1.2 for further details). Consequently,
all previously known estimation procedures for ARMA models can be applied here, without
assuming that the error distribution is known.

For a general regression function, the model (1.1) is a Hidden Markov Model with possibly
a non-compact continuous state space, and with unknown innovation distribution. When the
innovation distribution is known up to a finite dimensional parameter, the model (1.1) is fully
parametric and various results are already stated. Among others, the parameters can be
estimated by maximum likelihood, and consistency, asymptotic normality and efficiency have been
proved. For further references on estimation in fully parametric Hidden Markov Models, we
refer for instance to Leroux (?), Bickel et al. (?), Jensen and Petersen (?), Douc and Matias
(?), Douc et al. (?), Fuh (?), Genon-Catalot and Laredo (?), Na et al. (?), and Douc et al. (?).



In this paper, we consider the case where the innovation distribution is unknown, and thus 
the model is not fully parametric. In this general context, there are few results. To our knowl- 
edge, the only paper which gives a consistent estimator is the paper by Comte and Taupin (?). 
These authors propose an estimation procedure based on a modified least squares minimization. 
They give an upper bound for the rate of convergence of their estimator, that depends on the
smoothness of the regression function and on the smoothness of $f_\varepsilon$. Those results are obtained
by assuming that the distribution $P_X$ of $X_0$ admits a density $f_X$ with respect to the Lebesgue
measure and that the stationary Markov chain $(X_i)_{i\ge 0}$ is absolutely regular ($\beta$-mixing). The
main drawback of their approach is that their estimation criterion is not explicit, hence the links 
between the convergence rate of their estimator and the smoothness of the regression function 
and of the error distribution are not explicit either. Consequently, Comte and Taupin (?) are 
able to prove that their estimator achieves the parametric rate only for very few couples of 
regression functions/error distribution. Lastly their dependency conditions are quite restrictive, 
and the assumption that X admits a density is not natural in this context. 

Our results. In this paper, we propose a new estimation procedure which provides a consistent
estimator with a parametric rate of convergence in a very general context. Our approach is based
on the new contrast function

    $S_{\theta^0,P_X}(\theta) = E[(Z_1 - f_\theta(X_0))^2\, w(X_0)],$

where the weight function $w$ is such that $(w f_\theta)^*/f_\varepsilon^*$ and $(w f_\theta^2)^*/f_\varepsilon^*$ are integrable, and where
$\varphi^*$ is the Fourier transform of a function $\varphi$. We estimate $\theta^0$ by
$\hat\theta = \arg\min_{\theta\in\Theta} S_n(\theta)$, where

(1.2)    $S_n(\theta) = \frac{1}{2\pi n}\sum_{k=1}^{n} \mathrm{Re}\int \frac{((Z_k - f_\theta)^2 w)^*(t)\, e^{-itZ_{k-1}}}{f_\varepsilon^*(-t)}\,dt,$

and $\mathrm{Re}(u)$ is the real part of $u$. Under general assumptions, we prove that the estimator
$\hat\theta$ so defined is consistent. Moreover, we give some conditions under which the parametric rate of
convergence as well as the asymptotic normality can be stated. Those results hold under weak
dependency conditions as introduced in Dedecker and Prieur (?).

This procedure is clearly simpler than that of Comte and Taupin (?). The resulting rate is
more explicit and links directly the smoothness of the regression function to that of $f_\varepsilon$. Our
new estimator is asymptotically Gaussian for a large class of regression functions, which is not
the case in Comte and Taupin (?).

The asymptotic properties of our estimator are illustrated through a simulation study. It
confirms that our estimator performs well in various contexts, even in cases where the Markov
chain $(X_i)_{i\ge 0}$ is not $\beta$-mixing (and not even irreducible), when the signal to noise ratio is small
or large, for various sample sizes, and for different types of error distribution. Our estimator
always performs better than the so-called naive estimator (built by replacing the non-observed
$X$ by $Z$ in the usual least squares criterion). Our estimation procedure depends on the choice of
the weight function $w$. The influence of this weight function is also studied in the simulations.

Finally, we propose a more general estimator when it is not possible to find a weight function
$w$ such that $(w f_\theta)^*/f_\varepsilon^*$ and $(w f_\theta^2)^*/f_\varepsilon^*$ are integrable. We establish a consistency result, and
we give an upper bound for the quadratic risk, that relates the smoothness properties of the
regression function to that of $f_\varepsilon$. These last results are proved under $\alpha$-mixing conditions.

The paper is organized as follows. In Section 2 we present our estimation procedure. The
theoretical properties of the estimator are stated in Section 3. The simulations are presented in
Section 4. In Section 5 we introduce a more general estimator and we describe its asymptotic
behavior. The proofs are gathered in the Appendix.



2. Estimation procedure 

In order to define more rigorously the criterion presented in the introduction, we first give 
some preliminary notations and assumptions. 



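2.1. Notations. Let

    $\|\varphi\|_1 = \int |\varphi(x)|\,dx, \quad \|\varphi\|_2^2 = \int \varphi^2(x)\,dx, \quad \text{and} \quad \|\varphi\|_\infty = \sup_{x\in\mathbb{R}} |\varphi(x)|.$

The convolution product of two square integrable functions $p$ and $q$ is denoted by
$p * q(z) = \int p(z - x) q(x)\,dx$. The Fourier transform $\varphi^*$ of a function $\varphi$ is defined by

    $\varphi^*(t) = \int e^{itx} \varphi(x)\,dx.$

For $\theta \in \mathbb{R}^d$, let $\|\theta\|_{\ell^2}^2 = \sum_{k=1}^{d} \theta_k^2$ and let $\theta^T$ be the transpose of $\theta$.

For a map $(\theta, u) \mapsto \varphi_\theta(u)$ from $\Theta \times \mathbb{R}$ to $\mathbb{R}$, the first and second derivatives with respect to $\theta$
are denoted by

    $\varphi_\theta^{(1)}(\cdot) = \big(\varphi_{\theta,j}^{(1)}(\cdot)\big)_{1\le j\le d}$ with $\varphi_{\theta,j}^{(1)}(\cdot) = \frac{\partial \varphi_\theta(\cdot)}{\partial \theta_j}$ for $j \in \{1, \dots, d\}$,

and $\varphi_\theta^{(2)}(\cdot) = \big(\varphi_{\theta,j,k}^{(2)}(\cdot)\big)_{1\le j,k\le d}$ with $\varphi_{\theta,j,k}^{(2)}(\cdot) = \frac{\partial^2 \varphi_\theta(\cdot)}{\partial\theta_j \partial\theta_k}$ for $j, k \in \{1, \dots, d\}$.

From now on, $P$, $E$ and $\mathrm{Var}$ denote respectively the probability $P_{\theta^0,P_X}$, the expected value $E_{\theta^0,P_X}$
and the variance $\mathrm{Var}_{\theta^0,P_X}$, when the underlying and unknown true parameters are $\theta^0$ and $P_X$.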

2.2. Assumptions. We consider three types of assumptions. 

• Smoothness and moment assumptions

(A1) On $\Theta^\circ$, the function $\theta \mapsto f_\theta$ admits continuous derivatives with respect to $\theta$ up to
order 3.

(A2) On $\Theta^\circ$, the quantity $w(X_0)(Z_1 - f_\theta(X_0))^2$, and the absolute values of its derivatives
with respect to $\theta$ up to order 2, have a finite expectation.

• Identifiability assumptions

(I1) The quantity $S_{\theta^0,P_X}(\theta) = E[(f_{\theta^0}(X) - f_\theta(X))^2 w(X)]$ admits one unique minimum at
$\theta = \theta^0$.

(I2) For all $\theta \in \Theta^\circ$, the matrix $S^{(2)}_{\theta^0,P_X}(\theta) = \Big(\frac{\partial^2 S_{\theta^0,P_X}(\theta)}{\partial\theta_i \partial\theta_j}\Big)_{1\le i,j\le d}$ exists and the matrix

    $S^{(2)}_{\theta^0,P_X}(\theta^0) = 2E\big[w(X)\, f^{(1)}_{\theta^0}(X)\, \big(f^{(1)}_{\theta^0}(X)\big)^T\big]$

is positive definite.

• Assumptions on $f_\varepsilon$

(N1) The density $f_\varepsilon$ belongs to $L_2(\mathbb{R})$ and for all $x \in \mathbb{R}$, $f_\varepsilon^*(x) \neq 0$.

The assumption (N1) is quite usual when considering estimation in the convolution model.
It ensures the existence of the estimation criterion.

2.3. Definition of the estimator. As already mentioned in the introduction, the starting
point of our estimation procedure is to construct an estimator of the least squares contrast

(2.3)    $S_{\theta^0,P_X}(\theta) = E[(Z_1 - f_\theta(X_0))^2 w(X_0)],$

based on the observations $(Z_i)$ for $i = 0, \dots, n$.

We consider the following condition: there exists a weight function $w$ such that for all $\theta \in \Theta$,

(C1) The functions $(w f_\theta)$ and $(w f_\theta^2)$ belong to $L_1(\mathbb{R})$, and the functions $w^*/f_\varepsilon^*$, $(f_\theta w)^*/f_\varepsilon^*$,
$(f_\theta^2 w)^*/f_\varepsilon^*$ belong to $L_1(\mathbb{R})$.

Remark 2.1. The first part of Condition (C1) is not restrictive. The second part can be
heuristically expressed as "one can find a weight function $w$ such that $w f_\theta$ is smooth enough
compared to $f_\varepsilon$". For a large number of regression functions, such a weight function can be
easily exhibited. Some practical choices are discussed in the simulation study (Section 4).
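
The link between the contrast (2.3) and the identifiability assumption (I1) can be made explicit with a short computation. The sketch below uses only Model (1.1) and the independence assumptions of Section 1, together with the additional assumption (made here for illustration, and satisfied in the simulation study) that the errors $\varepsilon_i$ are centered. Writing $Z_1 = f_{\theta^0}(X_0) + \xi_1 + \varepsilon_1$, the cross terms vanish because $\xi_1$ and $\varepsilon_1$ are centered and independent of $X_0$:

```latex
\begin{align*}
E\big[(Z_1 - f_\theta(X_0))^2 w(X_0)\big]
  &= E\big[(f_{\theta^0}(X_0) - f_\theta(X_0) + \xi_1 + \varepsilon_1)^2 w(X_0)\big] \\
  &= E\big[(f_{\theta^0}(X_0) - f_\theta(X_0))^2 w(X_0)\big]
     + (\sigma_\xi^2 + \sigma_\varepsilon^2)\, E[w(X_0)] .
\end{align*}
```

The last term does not depend on $\theta$, so under these assumptions the contrast (2.3) and the quantity appearing in (I1) have the same minimizers.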

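If (C1) holds, the expectations $E(w(X))$, $E(w(X) f_\theta(X))$ and $E(w(X) f_\theta^2(X))$ can be easily
estimated. Let us present the ideas of the estimation procedure. Let $\varphi$ be such that $\varphi$ and
$\varphi^*/f_\varepsilon^*$ belong to $L_1(\mathbb{R})$. For such a function, due to the independence between $\varepsilon_0$ and $X_0$ we
have

    $E[\varphi(X_0)] = E\Big(\frac{1}{2\pi}\int \varphi^*(t)\, e^{-itX_0}\,dt\Big) = E\Big(\frac{1}{2\pi}\int \frac{\varphi^*(t)\, e^{-itZ_0}}{f_\varepsilon^*(-t)}\,dt\Big).$

Hence, based on the observations $Z_0, \dots, Z_n$, $E[\varphi(X_0)]$ is estimated by

    $\frac{1}{n}\sum_{j=1}^{n} \frac{1}{2\pi}\,\mathrm{Re}\int \frac{\varphi^*(t)\, e^{-itZ_j}}{f_\varepsilon^*(-t)}\,dt.$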
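We then propose to estimate $S_{\theta^0,P_X}(\theta)$ by the quantity $S_n(\theta)$ defined by

(2.4)    $S_n(\theta) = \frac{1}{2\pi n}\sum_{k=1}^{n} \mathrm{Re}\int \frac{((Z_k - f_\theta)^2 w)^*(t)\, e^{-itZ_{k-1}}}{f_\varepsilon^*(-t)}\,dt,$

which satisfies

    $E(S_n(\theta)) = E[(Z_1 - f_\theta(X_0))^2 w(X_0)].$

This criterion is minimal at $\theta = \theta^0$ under the identifiability assumption (I1). Using this
empirical criterion we propose to estimate $\theta^0$ by

(2.5)    $\hat\theta = \arg\min_{\theta\in\Theta} S_n(\theta).$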
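
The deconvolution identity used in Section 2.3 to estimate $E[\varphi(X_0)]$ can be checked numerically. The following Python sketch is an illustration only, not the authors' implementation: it draws i.i.d. uniform $X$ (no Markov structure), adds Laplace noise, and evaluates the Fourier integral with a trapezoidal rule. The function name `deconv_mean`, the test function $\varphi(x) = e^{-x^2/2}$, the grid and the sample size are all assumptions made for the example.

```python
import math
import numpy as np

def deconv_mean(z, phi_star, f_eps_star, t_max=10.0, n_grid=801):
    """Estimate E[phi(X)] from noisy observations z = x + eps.

    Evaluates (1/n) sum_j (1/2pi) Re int phi*(t) exp(-i t z_j) / f_eps*(-t) dt
    with a trapezoidal rule on [-t_max, t_max].
    """
    t = np.linspace(-t_max, t_max, n_grid)
    dt = t[1] - t[0]
    # Empirical mean over j of exp(-i t Z_j), one value per grid point t.
    ecf = np.exp(-1j * np.outer(t, z)).mean(axis=1)
    g = (phi_star(t) * ecf / f_eps_star(-t)).real
    integral = dt * (g.sum() - 0.5 * (g[0] + g[-1]))
    return integral / (2.0 * math.pi)

rng = np.random.default_rng(0)
n, sigma_eps = 4000, 0.3                      # illustrative values
x = rng.uniform(0.0, 1.0, size=n)             # unobserved signal
eps = rng.laplace(0.0, sigma_eps / math.sqrt(2.0), size=n)  # variance sigma_eps^2
z = x + eps                                   # observations

phi = lambda u: np.exp(-u**2 / 2.0)
# Fourier transform of phi with the convention phi*(t) = int exp(itx) phi(x) dx.
phi_star = lambda t: math.sqrt(2.0 * math.pi) * np.exp(-t**2 / 2.0)
# Fourier transform of the centered Laplace density with variance sigma_eps^2.
laplace_star = lambda t: 1.0 / (1.0 + sigma_eps**2 * t**2 / 2.0)

estimate = deconv_mean(z, phi_star, laplace_star)
naive = phi(z).mean()                         # ignores the measurement error
truth = math.sqrt(math.pi / 2.0) * math.erf(1.0 / math.sqrt(2.0))
print(estimate, naive, truth)
```

The plug-in mean `naive` averages $\varphi(Z_j)$ directly and is biased by the noise, whereas the deconvolved estimate targets $E[\varphi(X_0)]$ itself.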



3. Asymptotic properties 

In this section, we give some conditions under which our estimator is consistent and asymp- 
totically normal. 

3.1. Consistency of the estimator. The first result to mention is the consistency of our 
estimator. It holds under the following additional condition. 



(C2) The functions $\sup_{\theta\in\Theta} \big|(f^{(1)}_{\theta,i} w)^*/f_\varepsilon^*\big|$ and $\sup_{\theta\in\Theta} \big|(f_\theta f^{(1)}_{\theta,i} w)^*/f_\varepsilon^*\big|$ belong to $L_1(\mathbb{R})$ for any
$i \in \{1, \dots, d\}$.

This condition is similar to (C1) for the first derivatives of $f_\theta$. Thus it is not more restrictive
than (C1).

Theorem 3.1. Consider Model (1.1) under the assumptions (A1)-(A2), (I1), (I2), (N1)
and the conditions (C1)-(C2). Then $\hat\theta$ defined by (2.5) converges in probability to $\theta^0$.
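3.2. $\sqrt{n}$-consistency and asymptotic normality. To state the asymptotic normality of our
estimator, we need to introduce some additional conditions.

(C3) the functions $\sup_{\theta\in\Theta} \big|(f^{(2)}_{\theta,i,j} w)^*/f_\varepsilon^*\big|$ and $\sup_{\theta\in\Theta} \Big|\Big(\frac{\partial^2 (f_\theta^2 w)}{\partial\theta_i \partial\theta_j}\Big)^*/f_\varepsilon^*\Big|$ belong to $L_1(\mathbb{R})$ for
any $i, j \in \{1, \dots, d\}$;

(C4) the functions $\sup_{\theta\in\Theta} \Big|\Big(\frac{\partial^3 (f_\theta w)}{\partial\theta_i \partial\theta_j \partial\theta_k}\Big)^*/f_\varepsilon^*\Big|$ and $\sup_{\theta\in\Theta} \Big|\Big(\frac{\partial^3 (f_\theta^2 w)}{\partial\theta_i \partial\theta_j \partial\theta_k}\Big)^*/f_\varepsilon^*\Big|$ belong to
$L_1(\mathbb{R})$, for $i, j, k \in \{1, \dots, d\}$;

(C5) The integrals $\int |t\,(f_{\theta^0} w)^*(t)|\,dt$ and $\int |t\,(f_{\theta^0} f^{(1)}_{\theta^0,k} w)^*(t)|\,dt$ are finite, for $k \in \{1, \dots, d\}$.

The asymptotic properties of $\hat\theta$, defined by (2.5), are stated under two different dependency
conditions, which are presented below.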

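Definition 3.1. Let $(\Omega, \mathcal{A}, P)$ be a probability space. Let $Y$ be a random variable with values in
a Banach space $(\mathbb{B}, \|\cdot\|_{\mathbb{B}})$. Denote by $\Lambda_\kappa(\mathbb{B})$ the set of $\kappa$-Lipschitz functions, i.e. the functions
$f$ from $(\mathbb{B}, \|\cdot\|_{\mathbb{B}})$ to $\mathbb{R}$ such that $|f(x) - f(y)| \le \kappa \|x - y\|_{\mathbb{B}}$. Let $\mathcal{M}$ be a $\sigma$-algebra of $\mathcal{A}$. Let
$P_{Y|\mathcal{M}}$ be a conditional distribution of $Y$ given $\mathcal{M}$, $P_Y$ the distribution of $Y$, and $\mathcal{B}(\mathbb{B})$ the Borel
$\sigma$-algebra on $(\mathbb{B}, \|\cdot\|_{\mathbb{B}})$. The dependence coefficients $\alpha$ and $\tau$ are defined by

    $\alpha(\mathcal{M}, \sigma(Y)) = \frac{1}{2} \sup_{A \in \mathcal{B}(\mathbb{B})} E\big(|P_{Y|\mathcal{M}}(A) - P_Y(A)|\big),$

and, if $E(\|Y\|_{\mathbb{B}}) < \infty$,

    $\tau(\mathcal{M}, Y) = E\Big[\sup_{f \in \Lambda_1(\mathbb{B})} |P_{Y|\mathcal{M}}(f) - P_Y(f)|\Big].$

Let $X = (X_i)_{i\ge 0}$ be a strictly stationary Markov chain of real-valued random variables. On
$\mathbb{R}^2$, we put the norm $\|x\|_{\mathbb{R}^2} = (|x_1| + |x_2|)/2$. For any integer $k \ge 0$, the coefficients $\alpha_X(k)$ and
$\tau_{X,2}(k)$ of the chain are defined by

    $\alpha_X(k) = \alpha(\sigma(X_0), \sigma(X_k)),$

and, if $E(|X_0|) < \infty$,

    $\tau_{X,2}(k) = \sup\{\tau(\sigma(X_0), (X_{i_1}, X_{i_2})),\ k \le i_1 \le i_2\}.$

The coefficient $\alpha(\mathcal{M}, \sigma(Y))$ is the usual strong mixing coefficient introduced by Rosenblatt (?).
The coefficient $\tau(\mathcal{M}, Y)$ has been introduced by Dedecker and Prieur (?). In Section A.2 we recall
some conditions on $\xi_0$ and $f_{\theta^0}$ under which the Markov chain $(X_i)_{i\ge 0}$ is $\alpha$-mixing or $\tau$-dependent
and illustrate those conditions through some examples.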

First we state the asymptotic normality of $\hat\theta$ when the Markov chain $(X_i)$ of Model (1.1) is
$\alpha$-mixing.

Theorem 3.2. Consider Model (1.1) under assumptions (A1), (A2), (I1), (I2), (N1), and
conditions (C1)-(C4). Let $Q_{|X_1|}$ be the inverse cadlag of the tail function $t \to P(|X_1| > t)$.
Assume that

(3.6)    $\sum_{k\ge 1} \int_0^{\alpha_X(k)} Q^2_{|X_1|}(u)\,du < \infty.$

Then $\hat\theta$ defined by (2.5) is a $\sqrt{n}$-consistent estimator of $\theta^0$ which satisfies

    $\sqrt{n}(\hat\theta - \theta^0) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \Sigma_1),$

where the covariance matrix $\Sigma_1$ is defined in equation (B.5).

Next, we give the corresponding result when the Markov chain $(X_i)$ is $\tau$-dependent.

Theorem 3.3. Consider Model (1.1) under assumptions (A1), (A2), (I1), (I2), (N1), and
conditions (C1)-(C5). Let $G(t) = t^{-1} E(X_1^2 \mathbf{1}_{X_1^2 > t})$, and let $G^{-1}$ be the inverse cadlag of $G$.
Assume that

(3.7)    $\sum_{k>0} G^{-1}(\tau_{X,2}(k))\, \tau_{X,2}(k) < \infty.$

Then $\hat\theta$ defined by (2.5) is a $\sqrt{n}$-consistent estimator of $\theta^0$ which satisfies

    $\sqrt{n}(\hat\theta - \theta^0) \xrightarrow{\ \mathcal{L}\ } \mathcal{N}(0, \Sigma_1),$

where the covariance matrix $\Sigma_1$ is defined in equation (B.5).

Remark 3.1. Let us give some conditions under which (3.6) or (3.7) are verified. Assume that
$E(|X_0|^p) < \infty$ for some $p > 2$. Then (3.6) is true provided that $\sum_{k>0} k^{2/(p-2)} \alpha_X(k) < \infty$, and
(3.7) is true provided that $\sum_{k>0} (\tau_{X,2}(k))^{(p-2)/p} < \infty$.

Note that those results do not require the Markov chain to be absolutely regular, as is the
case in Comte and Taupin (?). Consequently they apply to autoregressive models with weaker
dependency conditions. Besides the dependency conditions, our estimation procedure achieves
the parametric rate for a larger class of regression functions than in Comte and Taupin (?).

The conditions under which Theorems 3.2 and 3.3 hold are similar, except Condition (C5)
which appears only in Theorem 3.3. This condition is just technical and not restrictive at all.

The choice of the weight function $w$ is crucial. Various weight functions can satisfy
Conditions (C1)-(C5). The numerical properties of the resulting estimators will differ from one
choice to another. This point is discussed on simulated data in the next section.

4. Simulation study 

We investigate the properties of our estimator for different regression functions on simulated
data. For each choice of regression function, we consider two error distributions: the Laplace
distribution and the Gaussian distribution. When $\varepsilon_1$ has the Laplace distribution, its density
and Fourier transform are

(4.8)    $f_\varepsilon(x) = \frac{1}{\sigma_\varepsilon\sqrt{2}} \exp\Big(-\frac{\sqrt{2}}{\sigma_\varepsilon}|x|\Big)$, and $f_\varepsilon^*(x) = \frac{1}{1 + \sigma_\varepsilon^2 x^2/2}.$

Hence, $\varepsilon_1$ is centered with variance $\sigma_\varepsilon^2$.

When $\varepsilon_1$ is Gaussian, its density and Fourier transform are

(4.9)    $f_\varepsilon(x) = \frac{1}{\sigma_\varepsilon\sqrt{2\pi}} \exp\Big(-\frac{x^2}{2\sigma_\varepsilon^2}\Big)$, and $f_\varepsilon^*(x) = \exp(-\sigma_\varepsilon^2 x^2/2).$

Hence, $\varepsilon_1$ is centered with variance $\sigma_\varepsilon^2$.

For each of these error distributions, we consider the case of a linear regression function and 
of a Cauchy regression function. We start with the linear case. 

4.1. Linear regression function. We consider the model (1.1) with $f_\theta(x) = ax + b$, where $|a| < 1$
and $\theta = (a, b)^T$. In these simulations, we have chosen to illustrate the numerical properties
of our estimator under the weakest of the dependency conditions, that is $\tau$-dependency. As
recalled in Appendix A.2, when $f_{\theta^0}$ is linear with $|a| < 1$, if $\xi_0$ has a density bounded from
below in a neighborhood of the origin, then the Markov chain $(X_i)_{i\ge 0}$ is $\alpha$-mixing. When $\xi_0$
does not have a density, then the chain may not be $\alpha$-mixing (and not even irreducible), but it
is always $\tau$-dependent.

Here, we consider the case where the innovation distribution is discrete, in such a way that
the stationary Markov chain is $\tau$-dependent but not $\alpha$-mixing. We also consider two distinct
values of $\theta^0$. For the first value, the stationary distribution of $X_i$ is absolutely continuous with
respect to the Lebesgue measure. For the second value, the stationary distribution is singular
with respect to the Lebesgue measure. In both cases Theorem 3.3 applies, and the estimator $\hat\theta$
is asymptotically normal.

• Case A (absolutely continuous stationary distribution). We focus on the case where the true
parameter is $\theta^0 = (1/2, 1/4)^T$, $X_0$ is uniformly distributed over $[0, 1]$, and $(\xi_i)_{i\ge 1}$ is a sequence
of i.i.d. random variables, independent of $X_0$ and such that $P(\xi_1 = -1/4) = P(\xi_1 = 1/4) = 1/2$.
Then the Markov chain defined for $i > 0$ by

(4.10)    $X_i = \frac{1}{4} + \frac{1}{2} X_{i-1} + \xi_i$

is strictly stationary, the stationary distribution being the uniform distribution over $[0, 1]$, and
consequently $\sigma_{X_0}^2 = 1/12$. This chain is non-irreducible, and the dependency coefficients are
such that $\alpha_X(k) = 1/4$ (see for instance Bradley (?), p. 180) and $\tau_{X,2}(k) = O(2^{-k})$. Thus
the Markov chain is not $\alpha$-mixing, but it is $\tau$-dependent. For the simulation, we start with $X_0$
uniformly distributed over $[0, 1]$, so the simulated chain is stationary.

• Case B (singular stationary distribution). We consider the case where the true parameter is
$\theta^0 = (1/3, 1/3)^T$, $X_0$ is uniformly distributed over the Cantor set, and $(\xi_i)_{i\ge 1}$ is a sequence of
i.i.d. random variables, independent of $X_0$ and such that $P(\xi_1 = -1/3) = P(\xi_1 = 1/3) = 1/2$.
Then the Markov chain defined for $i > 0$ by

(4.11)    $X_i = \frac{1}{3} + \frac{1}{3} X_{i-1} + \xi_i$

is strictly stationary, the stationary distribution being the uniform distribution over the Cantor
set, and consequently $\sigma_{X_0}^2 = 1/8$. This chain is non-irreducible, and the dependency coefficients
satisfy $\alpha_X(k) = 1/4$ and $\tau_{X,2}(k) = O(3^{-k})$. Thus the Markov chain is not $\alpha$-mixing, but is
$\tau$-dependent. For the simulation, we start with $\tilde X_0$ uniformly distributed over $[0, 1]$, and we
consider that the chain $(\tilde X_i)$ is close to the stationary chain after 1000 iterations. We then set
$X_i = \tilde X_{i+1000}$.
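
The two simulation designs above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used by the authors: the helper names `simulate_linear_chain` and `add_gaussian_noise`, the seed and the chain length are hypothetical, and the noise variance is set from the theoretical stationary variance ($1/12$ in Case A) for a ratio $\sigma_\varepsilon^2/\mathrm{Var}(X) = 0.5$.

```python
import numpy as np

def simulate_linear_chain(n, a, b, xi_abs, rng, burn_in=0):
    """Simulate X_i = b + a * X_{i-1} + xi_i, with xi_i = +/- xi_abs w.p. 1/2.

    The chain starts from a uniform draw on [0, 1]; the first `burn_in`
    values are discarded. Returns the n + 1 values X_0, ..., X_n.
    """
    total = n + burn_in
    xi = rng.choice([-xi_abs, xi_abs], size=total)
    x = np.empty(total + 1)
    x[0] = rng.uniform()
    for i in range(1, total + 1):
        x[i] = b + a * x[i - 1] + xi[i - 1]
    return x[burn_in:]

def add_gaussian_noise(x, sigma2_eps, rng):
    """Return Z_i = X_i + eps_i with centered Gaussian errors of variance sigma2_eps."""
    return x + rng.normal(0.0, np.sqrt(sigma2_eps), size=x.shape)

rng = np.random.default_rng(1)
n = 20000

# Case A: theta0 = (1/2, 1/4), uniform stationary law, no burn-in needed.
x_a = simulate_linear_chain(n, 0.5, 0.25, 0.25, rng)
# Case B: theta0 = (1/3, 1/3), Cantor stationary law, 1000 burn-in steps.
x_b = simulate_linear_chain(n, 1 / 3, 1 / 3, 1 / 3, rng, burn_in=1000)

# Observations for Case A with sigma_eps^2 / Var(X) = 0.5.
z_a = add_gaussian_noise(x_a, 0.5 / 12.0, rng)

print(x_a.var(), x_b.var(), z_a.var())
```

The empirical variances should be close to $1/12$ for Case A and $1/8$ for Case B, in line with the stationary distributions stated above.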



In these two cases, we can find a weight function $w$ satisfying the conditions (C1)-(C5). We
first give the detailed expression of the estimator for two choices of weight functions $w$. Then we
recall the classic estimator when $X$ is directly observed, the ARMA estimator, and the so-called
naive estimator.

4.1.1. Expression of the estimator. We consider the two following weight functions $w$:

(4.12)    $w(x) = N(x) = \exp\{-x^2/(4\sigma_\varepsilon^2)\}$ and $w(x) = SC(x) = \frac{1}{2\pi}\Big(\frac{2\sin(x)}{x}\Big)^4.$

These choices of weight ensure that Conditions (C1)-(C5) hold and that the two estimators,
denoted by $\hat\theta_N$ and $\hat\theta_{SC}$ respectively, converge to $\theta^0$ with the parametric rate of convergence.
There are two main differences between these two weight functions. First, $N$ depends on the
error variance $\sigma_\varepsilon^2$. Hence the estimator should be adaptive to the noise level. On the other hand,
it may be sensitive to very small error variance, as it appears in the simulations (see Figure 1).
Second, $SC$ has strong smoothness properties since its Fourier transform is compactly supported.
The two associated estimators are based on the calculation of $S_n(\theta)$, which can be written as

    $S_n(\theta) = \frac{1}{n}\sum_{k=1}^{n} \big\{(Z_k^2 + b^2 - 2Z_k b)\, I_0(Z_{k-1}) + a^2 I_2(Z_{k-1}) - 2a(Z_k - b)\, I_1(Z_{k-1})\big\},$

with

(4.13)    $I_j(Z) = \frac{1}{2\pi}\,\mathrm{Re}\int (p_j w)^*(u)\, \frac{e^{-iuZ}}{f_\varepsilon^*(-u)}\,du,$

where $p_j(x) = x^j$ for $j = 0, 1, 2$, $w$ being either $w = N$ or $w = SC$. With the above notations,
$\hat\theta = (\hat a, \hat b)^T$ satisfies

(4.14)    $\hat a = \frac{\sum_{k=1}^n Z_k I_1(Z_{k-1}) \sum_{k=1}^n I_0(Z_{k-1}) - \sum_{k=1}^n Z_k I_0(Z_{k-1}) \sum_{k=1}^n I_1(Z_{k-1})}{\sum_{k=1}^n I_0(Z_{k-1}) \sum_{k=1}^n I_2(Z_{k-1}) - \big(\sum_{k=1}^n I_1(Z_{k-1})\big)^2},$

(4.15)    $\hat b = \frac{\sum_{k=1}^n Z_k I_0(Z_{k-1})}{\sum_{k=1}^n I_0(Z_{k-1})} - \hat a\, \frac{\sum_{k=1}^n I_1(Z_{k-1})}{\sum_{k=1}^n I_0(Z_{k-1})}.$

We now compute $I_j(Z)$ for $j = 0, 1, 2$ and the two weight functions. In the following we
respectively denote by $I_{j,N}(Z)$ and $I_{j,SC}(Z)$ the previous integrals when the weight function is
either $w = N$ or $w = SC$.

We start with $w = N$ and give the details of the calculations for the two error distributions
(Laplace and Gaussian), which are explicit. Then, with the weight function $w = SC$, we present
the calculations, which are not explicit whatever the error distribution $f_\varepsilon$.
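
To make formulas (4.14) and (4.15) concrete, here is a hedged Python sketch of the estimator in the linear Case A design with Gaussian errors and the weight $w = N$. It relies on the closed forms $I_{0,N}(Z) = \sqrt{2}\,e^{-Z^2/(2\sigma_\varepsilon^2)}$, $I_{1,N}(Z) = 2\sqrt{2}\,Z e^{-Z^2/(2\sigma_\varepsilon^2)}$ and $I_{2,N}(Z) = \sqrt{2}(4Z^2 - 2\sigma_\varepsilon^2) e^{-Z^2/(2\sigma_\varepsilon^2)}$ derived in this section for the Gaussian case. The function names, seed and sample size are assumptions made for the illustration; this is not the authors' code.

```python
import numpy as np

def weights_N_gauss(z, sigma2):
    """Closed forms I_{0,N}, I_{1,N}, I_{2,N} for Gaussian errors of variance sigma2."""
    g = np.sqrt(2.0) * np.exp(-z**2 / (2.0 * sigma2))
    return g, 2.0 * z * g, (4.0 * z**2 - 2.0 * sigma2) * g

def estimate_ab(z, I0, I1, I2):
    """Formulas (4.14)-(4.15); z = (Z_0, ..., Z_n), I_j evaluated at Z_{k-1}."""
    zk = z[1:]
    s0, s1, s2 = I0.sum(), I1.sum(), I2.sum()
    sz0, sz1 = (zk * I0).sum(), (zk * I1).sum()
    a_hat = (sz1 * s0 - sz0 * s1) / (s0 * s2 - s1**2)
    b_hat = sz0 / s0 - a_hat * s1 / s0
    return a_hat, b_hat

def naive_ab(z):
    """Ordinary least squares of Z_k on Z_{k-1}, ignoring the measurement error."""
    zk, zp = z[1:], z[:-1]
    n = zk.size
    a = (n * (zk * zp).sum() - zk.sum() * zp.sum()) / (n * (zp**2).sum() - zp.sum() ** 2)
    return a, zk.mean() - a * zp.mean()

rng = np.random.default_rng(2)
n, a0, b0 = 100_000, 0.5, 0.25          # Case A true parameter
sigma2 = 0.5 / 12.0                     # sigma_eps^2 / Var(X) = 0.5

# Case A chain: X_i = 1/4 + X_{i-1}/2 + xi_i, xi_i = +/- 1/4.
x = np.empty(n + 1)
x[0] = rng.uniform()
xi = rng.choice([-0.25, 0.25], size=n)
for i in range(1, n + 1):
    x[i] = b0 + a0 * x[i - 1] + xi[i - 1]
z = x + rng.normal(0.0, np.sqrt(sigma2), size=n + 1)

I0, I1, I2 = weights_N_gauss(z[:-1], sigma2)
a_hat, b_hat = estimate_ab(z, I0, I1, I2)
a_naive, b_naive = naive_ab(z)
print(a_hat, b_hat, a_naive, b_naive)
```

With this design the naive estimate of $a$ settles near $1/3$ rather than $1/2$, while the deconvolution-based estimate fluctuates around the true value.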

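• When $w = N$, Fourier calculations provide that

    $N^*(t) = \sqrt{2\pi}\sqrt{2\sigma_\varepsilon^2}\, \exp(-\sigma_\varepsilon^2 t^2),$
    $(N p_1)^*(t) = \sqrt{2\pi}\sqrt{2\sigma_\varepsilon^2}\, \exp(-\sigma_\varepsilon^2 t^2)\,(-2\sigma_\varepsilon^2 t/i),$
    $(N p_2)^*(t) = -\sqrt{2\pi}\sqrt{2\sigma_\varepsilon^2}\, \exp(-\sigma_\varepsilon^2 t^2)\,(-2\sigma_\varepsilon^2 + 4\sigma_\varepsilon^4 t^2).$

It follows that

    $I_{0,N}(Z) = \frac{1}{2\pi}\,\mathrm{Re}\int \sqrt{2\pi}\sqrt{2\sigma_\varepsilon^2}\, \exp(-\sigma_\varepsilon^2 t^2)\, \frac{e^{-itZ}}{f_\varepsilon^*(-t)}\,dt,$
    $I_{1,N}(Z) = \frac{1}{2\pi}\,\mathrm{Re}\int \sqrt{2\pi}\sqrt{2\sigma_\varepsilon^2}\, \exp(-\sigma_\varepsilon^2 t^2)\,(-2\sigma_\varepsilon^2 t/i)\, \frac{e^{-itZ}}{f_\varepsilon^*(-t)}\,dt,$
    $I_{2,N}(Z) = \frac{1}{2\pi}\,\mathrm{Re}\int \sqrt{2\pi}\sqrt{2\sigma_\varepsilon^2}\, \exp(-\sigma_\varepsilon^2 t^2)\,(2\sigma_\varepsilon^2 - 4\sigma_\varepsilon^4 t^2)\, \frac{e^{-itZ}}{f_\varepsilon^*(-t)}\,dt.$

If $f_\varepsilon$ is the Laplace distribution (4.8), replacing $f_\varepsilon^*$ by its expression we get

    $I_{0,N}(Z) = e^{-Z^2/(4\sigma_\varepsilon^2)} - \frac{\sigma_\varepsilon^2}{2} N''(Z) = \big[5/4 - Z^2/(8\sigma_\varepsilon^2)\big]\, e^{-Z^2/(4\sigma_\varepsilon^2)},$
    $I_{1,N}(Z) = \big[7Z/4 - Z^3/(8\sigma_\varepsilon^2)\big]\, e^{-Z^2/(4\sigma_\varepsilon^2)},$
    $I_{2,N}(Z) = \big[-\sigma_\varepsilon^2 + 9Z^2/4 - Z^4/(8\sigma_\varepsilon^2)\big]\, e^{-Z^2/(4\sigma_\varepsilon^2)}.$

If $f_\varepsilon$ is the Gaussian distribution (4.9), replacing $f_\varepsilon^*$ by its expression we obtain

    $I_{0,N}(Z) = \sqrt{2}\, e^{-Z^2/(2\sigma_\varepsilon^2)}, \quad I_{1,N}(Z) = 2\sqrt{2}\, Z\, e^{-Z^2/(2\sigma_\varepsilon^2)}, \quad \text{and} \quad I_{2,N}(Z) = \sqrt{2}\,(4Z^2 - 2\sigma_\varepsilon^2)\, e^{-Z^2/(2\sigma_\varepsilon^2)}.$

Hence we deduce the expression of $\hat a_N$ and $\hat b_N$ by applying (4.14) and (4.15).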
• When $w = SC$, Fourier calculations provide that

    $SC^*(t) = \mathbf{1}_{[-4,-2]}(t)\,(t^3/6 + 2t^2 + 8t + 32/3) + \mathbf{1}_{[-2,0]}(t)\,(-t^3/2 - 2t^2 + 16/3)$
    $\qquad\qquad + \mathbf{1}_{[2,4]}(t)\,(-t^3/6 + 2t^2 - 8t + 32/3) + \mathbf{1}_{[0,2]}(t)\,(t^3/2 - 2t^2 + 16/3),$
    $(SC\,p_1)^*(t) = \frac{\partial}{\partial t} SC^*(t)/i$ and $(SC\,p_2)^*(t) = \frac{\partial^2}{\partial t^2} SC^*(t)/(i^2).$

The integrals $I_{j,SC}(Z)$, defined for $j = 0, 1, 2$ by

(4.16)    $I_{j,SC}(Z) = \frac{1}{2\pi}\,\mathrm{Re}\int (SC\,p_j)^*(t)\, \frac{e^{-itZ}}{f_\varepsilon^*(-t)}\,dt,$

have no explicit form, whatever the error distribution $f_\varepsilon$. They have to be computed numerically,
using the IFFT Matlab function. More precisely, we consider a finite Fourier series approximation
of $(SC\,p_j)^*(t)/f_\varepsilon^*(-t)$ whose Fourier transform is calculated using the IFFT Matlab function. The
result is taken as an approximation of $I_{j,SC}(Z)$. Finally we deduce the expression of $\hat a_{SC}$ and
$\hat b_{SC}$ by applying (4.14) and (4.15).
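
As an alternative illustration of how the integrals (4.16) can be approximated, the sketch below replaces the IFFT step by a plain trapezoidal rule on the compact support $[-4, 4]$ of $SC^*$. This is a swapped-in technique chosen for brevity, not the procedure of the paper, and the names `sc`, `sc_pj_star` and `I_sc` are hypothetical. As a sanity check, when there is no noise ($f_\varepsilon^* \equiv 1$) Fourier inversion gives $I_{j,SC}(Z) = Z^j\, SC(Z)$.

```python
import numpy as np

def sc(x):
    """Weight SC(x) = (2 sin(x) / x)^4 / (2 pi); np.sinc handles x = 0."""
    return (2.0 * np.sinc(np.asarray(x, dtype=float) / np.pi)) ** 4 / (2.0 * np.pi)

def sc_pj_star(t, j):
    """Fourier transform of x^j SC(x), j = 0, 1, 2, from the piecewise cubic SC*."""
    u, s = np.abs(t), np.sign(t)
    inner, outer = u <= 2.0, u <= 4.0
    if j == 0:
        v = np.where(inner, u**3 / 2 - 2 * u**2 + 16 / 3,
                     np.where(outer, -u**3 / 6 + 2 * u**2 - 8 * u + 32 / 3, 0.0))
        return v.astype(complex)
    if j == 1:  # derivative of SC*, divided by i
        d1 = np.where(inner, s * (1.5 * u**2 - 4 * u),
                      np.where(outer, s * (-0.5 * u**2 + 4 * u - 8), 0.0))
        return d1 / 1j
    if j == 2:  # second derivative of SC*, divided by i^2 = -1
        d2 = np.where(inner, 3 * u - 4, np.where(outer, 4 - u, 0.0))
        return (-d2).astype(complex)
    raise ValueError("j must be 0, 1 or 2")

def I_sc(z, j, f_eps_star, n_grid=4001):
    """Trapezoidal approximation of I_{j,SC}(Z) in (4.16) on the support [-4, 4]."""
    t = np.linspace(-4.0, 4.0, n_grid)
    dt = t[1] - t[0]
    z = np.atleast_1d(np.asarray(z, dtype=float))
    h = sc_pj_star(t, j) / f_eps_star(-t)
    g = (h[None, :] * np.exp(-1j * np.outer(z, t))).real
    return dt * (g.sum(axis=1) - 0.5 * (g[:, 0] + g[:, -1])) / (2.0 * np.pi)

# Sanity check without noise: f_eps* = 1, so I_j(Z) should equal Z^j SC(Z).
z_test = np.array([0.0, 0.7, 1.0, 2.5])
no_noise = lambda t: np.ones_like(t)
checks = [np.max(np.abs(I_sc(z_test, j, no_noise) - z_test**j * sc(z_test)))
          for j in (0, 1, 2)]
print(checks)

# Example with Gaussian errors of variance 0.04 (illustrative value).
gauss_star = lambda t: np.exp(-0.04 * t**2 / 2.0)
print(I_sc(z_test, 0, gauss_star))
```

Because $SC^*$ vanishes outside $[-4, 4]$, the integrand is bounded for any error density satisfying (N1), which is what makes this weight convenient numerically.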

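4.1.2. Comparison with classical estimators. We compare the two estimators $\hat\theta_N$ and $\hat\theta_{SC}$ with
three classical estimators: the usual least squares estimator when there is no observation noise,
the ARMA estimator, and the so-called naive estimator.

• Estimator without noise. In the case where $\varepsilon_i = 0$, that is $(X_0, \dots, X_n)$ is observed without
error, the parameters can be easily estimated by the usual least squares estimators

    $\hat a_X = \frac{n\sum_{i=1}^n X_i X_{i-1} - \sum_{i=1}^n X_i \sum_{i=1}^n X_{i-1}}{n\sum_{i=1}^n X_{i-1}^2 - \big(\sum_{i=1}^n X_{i-1}\big)^2}$ and $\hat b_X = \frac{1}{n}\sum_{i=1}^n X_i - \hat a_X\, \frac{1}{n}\sum_{i=1}^n X_{i-1}.$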
• ARMA estimator. When the regression function is linear, the model may be written as

    $Z_i - aZ_{i-1} - b = \xi_i + \varepsilon_i - a\varepsilon_{i-1}.$

The auto-covariance function $\gamma_Y$ of the stationary sequence $Y_i = \xi_i + \varepsilon_i - a\varepsilon_{i-1}$ is given by

    $\gamma_Y(0) = (1 + a^2)\sigma_\varepsilon^2 + \sigma_\xi^2, \quad \gamma_Y(1) = -a\sigma_\varepsilon^2, \quad \text{and} \quad \gamma_Y(k) = 0 \text{ for } k > 1.$

It follows that $Y_i$ is an MA(1) process, which may be written as

    $Y_i = \eta_i - \beta\eta_{i-1},$

where $\eta_i$ is the innovation, and $|\beta| < 1$ (note that $|\beta| \neq 1$ because $\gamma_Y(0) - 2|\gamma_Y(1)| > 0$).
Moreover, one can give the explicit expression of $\beta$ and $\sigma_\eta^2$ in terms of $a$, $\sigma_\xi^2$ and $\sigma_\varepsilon^2$. It follows
that, if $|a| < 1$, $(Z_i)_{i\ge 0}$ is the causal invertible ARMA(1,1) process

(4.17)    $Z_i - aZ_{i-1} = b + \eta_i - \beta\eta_{i-1}.$

Note that $a \neq \beta$ except if $a = 0$. Hence, if $|a| < 1$ and $a \neq 0$, one can estimate the parameters
$(a, b, \beta)$ by maximizing the so-called Gaussian likelihood. These estimators are consistent and
asymptotically Gaussian. Moreover they are efficient when both the innovations and the errors $\varepsilon$
are Gaussian (see Hannan (?) or Brockwell and Davis (?)). Note that this well-known approach
does not require the knowledge of the error distribution, but of course it works only in the
particular case where the regression function $f_\theta$ is linear. For the computation of the ARMA
estimator we use the function arma from the R tseries package (see Trapletti and Hornik (?)).
The resulting estimators are denoted by $\hat a_{arma}$ and $\hat b_{arma}$.

• Naive estimator. The naive estimator is constructed by replacing the unobserved $X_i$ by the
observation $Z_i$ in the expression of $\hat a_X$ and $\hat b_X$:

    $\hat a_{naive} = \frac{n\sum_{i=1}^n Z_i Z_{i-1} - \sum_{i=1}^n Z_i \sum_{i=1}^n Z_{i-1}}{n\sum_{i=1}^n Z_{i-1}^2 - \big(\sum_{i=1}^n Z_{i-1}\big)^2}$ and $\hat b_{naive} = \frac{1}{n}\sum_{i=1}^n Z_i - \hat a_{naive}\, \frac{1}{n}\sum_{i=1}^n Z_{i-1}.$

Classical results show that $\hat\theta_{naive}$ is an asymptotically biased estimator of $\theta^0$, which is confirmed
by the simulation study.
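
Two quantities mentioned above have simple closed forms in the linear case, sketched here in Python. Neither formula is stated in the text: the first solves the standard MA(1) moment equations $\gamma_Y(0) = (1 + \beta^2)\sigma_\eta^2$ and $\gamma_Y(1) = -\beta\sigma_\eta^2$ for the invertible root, and the second is the usual attenuation limit of least squares under measurement error, using $\mathrm{Cov}(X_i, X_{i-1}) = a\,\mathrm{Var}(X)$ and $E(X) = b/(1 - a)$. The function names are hypothetical and the numerical values correspond to Case A with $\sigma_\varepsilon^2/\mathrm{Var}(X) = 0.5$.

```python
import math

def ma1_parameters(a, sigma2_xi, sigma2_eps):
    """Invertible MA(1) representation Y_i = eta_i - beta * eta_{i-1}.

    Solves gamma(0) = (1 + beta^2) * sigma2_eta and gamma(1) = -beta * sigma2_eta
    with gamma(0) = (1 + a^2) * sigma2_eps + sigma2_xi and gamma(1) = -a * sigma2_eps.
    """
    g0 = (1.0 + a * a) * sigma2_eps + sigma2_xi
    g1 = -a * sigma2_eps
    if g1 == 0.0:
        return 0.0, g0
    r = -g1 / g0                       # equals beta / (1 + beta^2), |r| < 1/2
    beta = (1.0 - math.sqrt(1.0 - 4.0 * r * r)) / (2.0 * r)
    return beta, g0 / (1.0 + beta * beta)

def naive_limits(a, b, var_x, sigma2_eps):
    """Large-sample limits of the naive least squares estimators of (a, b)."""
    a_lim = a * var_x / (var_x + sigma2_eps)
    b_lim = b * (1.0 - a_lim) / (1.0 - a)
    return a_lim, b_lim

# Case A: a = 1/2, b = 1/4, Var(xi) = 1/16, Var(X) = 1/12, ratio 0.5.
a, b, var_x, sigma2_xi = 0.5, 0.25, 1.0 / 12.0, 1.0 / 16.0
sigma2_eps = 0.5 * var_x
beta, sigma2_eta = ma1_parameters(a, sigma2_xi, sigma2_eps)
a_lim, b_lim = naive_limits(a, b, var_x, sigma2_eps)
print(beta, sigma2_eta, a_lim, b_lim)
```

For the three ratios used in the simulations (0.5, 1.5 and 3), this limit for $a$ equals $1/3$, $0.2$ and $0.125$, which is consistent with the mean values of the naive estimator reported in Tables 1 and 2.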

4.1.3. Simulation results. For each error distribution, we simulate 100 samples with size n,
n = 1000, 5000 and 10000. We consider different values of $\sigma_\varepsilon$ such that the ratio
s2n = $\sigma_\varepsilon^2/\mathrm{Var}(X)$ is 0.5, 1.5 or 3. The comparison of the five estimators is based on the bias,
the Mean Squared Error (MSE), and the box plots. If $\hat\theta(k)$ denotes the value of the estimation
for the k-th sample, the MSE is evaluated by the empirical mean over the 100 samples:

    $\mathrm{MSE}(\hat\theta) = \frac{1}{100}\sum_{k=1}^{100} (\hat\theta(k) - \theta^0)^2.$

Results are presented in Figures 1-2 and Tables 1-2.

The first thing to notice is that, not surprisingly, $\hat\theta_{naive}$ presents a bias, whatever the values of
n, s2n and the error distribution. The estimator $\hat\theta_X$ has the expected good properties (unbiased
and small MSE), but it is based on the observation of the $X_i$'s. The previously known estimator
FIGURE 1. Results for linear Case B and Gaussian error, with n = 5000 and
$\sigma_\varepsilon^2/\mathrm{Var}(X) = 0.5$. Box plots of the five estimators $\hat a_{arma}$, $\hat a_N$, $\hat a_{SC}$, $\hat a_X$ and $\hat a_{naive}$,
from left to right, based on 100 replications. True value is 1/3 (horizontal line).

FIGURE 2. Results for linear Case B and Gaussian error, with n = 5000 and
$\sigma_\varepsilon^2/\mathrm{Var}(X) = 6$. Box plots of the five estimators $\hat a_{arma}$, $\hat a_N$, $\hat a_{SC}$, $\hat a_X$ and $\hat a_{naive}$,
from left to right, based on 100 replications. True value is 1/3 (horizontal line).
                      Estimator
    n    s2n  par.  arma (MSE)      N (MSE)         SC (MSE)        X (MSE)         naive (MSE)
 1000    0.5   a    0.487 (0.008)   0.459 (0.020)   0.489 (0.002)   0.493 (0.001)   0.328 (0.030)
               b    0.257 (0.002)   0.262 (0.002)   0.255 (0.001)   0.253 (0.001)   0.336 (0.008)
         1.5   a    0.494 (0.015)   0.488 (0.013)   0.492 (0.006)   0.501 (0.001)   0.198 (0.092)
               b    0.251 (0.004)   0.253 (0.002)   0.253 (0.002)   0.249 (0.001)   0.399 (0.023)
         3     a    0.461 (0.044)   0.502 (0.029)   0.503 (0.026)   0.493 (0.001)   0.121 (0.145)
               b    0.270 (0.012)   0.249 (0.001)   0.249 (0.001)   0.253 (0.001)   0.440 (0.037)
 5000    0.5   a    0.497 (0.001)   0.499 (0.004)   0.499 (0.001)   0.499 (0.001)   0.332 (0.028)
               b    0.252 (0.001)   0.251 (0.001)   0.251 (0.001)   0.251 (0.001)   0.334 (0.007)
         1.5   a    0.498 (0.003)   0.508 (0.003)   0.503 (0.002)   0.499 (0.001)   0.199 (0.091)
               b    0.250 (0.001)   0.247 (0.001)   0.248 (0.001)   0.250 (0.001)   0.399 (0.022)
         3     a    0.487 (0.008)   0.492 (0.004)   0.495 (0.004)   0.500 (0.001)   0.123 (0.143)
               b    0.256 (0.002)   0.253 (0.001)   0.252 (0.001)   0.250 (0.001)   0.437 (0.035)
10000    0.5   a    0.496 (0.001)   0.501 (0.002)   0.500 (0.001)   0.499 (0.001)   0.334 (0.028)
               b    0.252 (0.001)   0.250 (0.001)   0.250 (0.001)   0.250 (0.001)   0.333 (0.007)
         1.5   a    0.504 (0.002)   0.500 (0.001)   0.501 (0.001)   0.500 (0.001)   0.200 (0.090)
               b    0.248 (0.001)   0.250 (0.001)   0.250 (0.001)   0.250 (0.001)   0.401 (0.023)
         3     a    0.493 (0.003)   0.499 (0.001)   0.499 (0.002)   0.498 (0.001)   0.124 (0.142)
               b    0.254 (0.001)   0.250 (0.001)   0.250 (0.001)   0.251 (0.001)   0.438 (0.036)

TABLE 1. Estimation results for Linear Case A, Laplace error. Mean estimated
values of the five estimators $\hat\theta_{arma}$, $\hat\theta_N$, $\hat\theta_{SC}$, $\hat\theta_X$ and $\hat\theta_{naive}$ are presented for
various values of n (1000, 5000 or 10000) and s2n (0.5, 1.5, 3). True values are
$a^0 = 1/2$, $b^0 = 1/4$. MSEs are given in brackets.



$\hat\theta_{arma}$ has good asymptotic properties. However its bias is often larger than the biases of $\hat\theta_N$ and
$\hat\theta_{SC}$, except when s2n = 0.5 and $\varepsilon$ is Gaussian.

We now consider the two estimators $\hat\theta_N$ and $\hat\theta_{SC}$. Recall that their construction requires
the choice of $w$. Note first that, whatever the weight function $w$, the two estimators $\hat\theta_N$ and
$\hat\theta_{SC}$ present good convergence properties. Their biases and MSEs decrease when n increases.
When compared to one another, we can see that their numerical behaviors are not the same.
Namely for not too large s2n, $\hat\theta_{SC}$ has a smaller MSE than $\hat\theta_N$ (see Figure 1 and Tables 1-2
when s2n $\le$ 3). With large s2n, the estimator $\hat\theta_N$ seems to have better properties (see Figure
2 when s2n = 6). This is expected since $N$ depends on $\sigma_\varepsilon^2$ and is thus more sensitive to small
values of $\sigma_\varepsilon^2$. The error distribution seems to have a slight influence on the MSEs of the two
estimators. The MSEs are often smaller when $f_\varepsilon$ is the Laplace density. This may be related
to the theoretical properties in density deconvolution. In that context it is well known that
the rate of convergence is slower when $f_\varepsilon$ is the Gaussian density. The two estimators $\hat\theta_N$ and
$\hat\theta_{SC}$ have comparable numerical behaviors in the two linear autoregressive models. Let us recall
that in both cases, the simulated chains $X$ are not $\alpha$-mixing but are $\tau$-dependent. In Case A, the
stationary distribution of $X$ is continuous whereas it is not the case in Case B. This explains
                      Estimator
    n    s2n  par.  arma (MSE)      N (MSE)         SC (MSE)        X (MSE)         naive (MSE)
 1000    0.5   a    0.483 (0.006)   0.539 (0.039)   0.496 (0.002)   0.495 (0.001)   0.331 (0.030)
               b    0.259 (0.002)   0.243 (0.003)   0.253 (0.001)   0.253 (0.001)   0.336 (0.008)
         1.5   a    0.497 (0.021)   0.516 (0.027)   0.507 (0.009)   0.499 (0.001)   0.200 (0.091)
               b    0.251 (0.005)   0.243 (0.005)   0.246 (0.002)   0.249 (0.001)   0.399 (0.023)
         3     a    0.456 (0.031)   0.521 (0.082)   0.481 (0.030)   0.501 (0.001)   0.120 (0.145)
               b    0.272 (0.008)   0.244 (0.016)   0.260 (0.007)   0.250 (0.001)   0.441 (0.037)
 5000    0.5   a    0.497 (0.001)   0.492 (0.006)   0.499 (0.001)   0.498 (0.001)   0.333 (0.028)
               b    0.251 (0.001)   0.252 (0.001)   0.250 (0.001)   0.250 (0.001)   0.333 (0.007)
         1.5   a    0.490 (0.002)   0.510 (0.006)   0.502 (0.001)   0.499 (0.001)   0.12U (0.090)
               b    0.254 (0.001)   0.245 (0.001)   0.248 (0.001)   0.250 (0.001)   0.399 (0.022)
         3     a    0.471 (0.010)   0.512 (0.008)   0.503 (0.005)   0.498 (0.001)   0.124 (0.141)
               b    0.263 (0.002)   0.245 (0.002)   0.249 (0.001)   0.251 (0.001)   0.437 (0.035)
10000    0.5   a    0.504 (0.006)   0.500 (0.003)   0.498 (0.001)   0.499 (0.001)   0.331 (0.028)
               b    0.249 (0.001)   0.250 (0.001)   0.251 (0.001)   0.251 (0.001)   0.335 (0.007)
         1.5   a    0.495 (0.002)   0.501 (0.002)   0.499 (0.001)   0.501 (0.001)   0.200 (0.090)
               b    0.253 (0.001)   0.250 (0.001)   0.251 (0.001)   0.250 (0.001)   0.401 (0.023)
         3     a    0.492 (0.004)   0.498 (0.004)   0.500 (0.003)   0.500 (0.001)   0.126 (0.140)
               b    0.254 (0.001)   0.251 (0.001)   0.251 (0.001)   0.250 (0.001)   0.437 (0.009)

TABLE 2. Estimation results for Linear Case A, Gaussian error. Mean estimated
values of the five estimators $\hat\theta_{arma}$, $\hat\theta_N$, $\hat\theta_{SC}$, $\hat\theta_X$ and $\hat\theta_{naive}$ are presented for
various values of n (1000, 5000 or 10000) and s2n (0.5, 1.5, 3). True values are
$a^0 = 1/2$, $b^0 = 1/4$. MSEs are given in brackets.



the relatively poor properties of $\hat\theta_{arma}$ in Case B. Indeed, due to its construction, this estimator
is expected to have good properties when the stationary distribution of the Markov chain is
close to the Gaussian distribution. By contrast, our estimators behave similarly in both
cases.



4.2. Cauchy regression model. We consider the model (1.1) with $f_\theta(x) = \theta/(1+x^2) = \theta f(x)$.
The true parameter is $\theta^0 = 1.5$. For the law of $\xi_0$ we take $\xi_0 \sim \mathcal{N}(0, 0.01)$. In this case, an
empirical study shows that $\sigma_X^2$ is about 0.1. Moreover $\alpha_X(k) = O(\kappa^k)$ for some $\kappa \in\, ]0,1[$ and
the Markov chain is $\alpha$-mixing (see Appendix A.2). For $w$ suitably chosen, Theorem 3.2 applies
and states that $\hat\theta$ is asymptotically normal. For the simulation, we start with $\tilde X_0$ uniformly
distributed over [0, 1], and we consider that the chain is close to the stationary chain after 1000
iterations. We then set $X_i = \tilde X_{i+1000}$.
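The simulation scheme just described can be sketched as follows. This is only a minimal illustration, not the code used for the reported study: the function name, the seed and the reading of $\mathcal{N}(0, 0.01)$ as a variance of 0.01 (standard deviation 0.1) are assumptions.

```python
import random

def simulate_cauchy_chain(n, theta=1.5, sigma_xi=0.1, burn_in=1000, seed=0):
    """Return X_0, ..., X_n from X_i = theta / (1 + X_{i-1}^2) + xi_i.

    sigma_xi is the standard deviation of the Gaussian innovations
    (assumed here to be 0.1, i.e. a variance of 0.01).
    """
    rng = random.Random(seed)
    x = rng.uniform(0.0, 1.0)              # initial value, uniform on [0, 1]
    for _ in range(burn_in):               # discard the transient part of the chain
        x = theta / (1.0 + x * x) + rng.gauss(0.0, sigma_xi)
    chain = [x]
    for _ in range(n):
        x = theta / (1.0 + x * x) + rng.gauss(0.0, sigma_xi)
        chain.append(x)
    return chain

chain = simulate_cauchy_chain(5000)
mean_x = sum(chain) / len(chain)
var_x = sum((v - mean_x) ** 2 for v in chain) / len(chain)
print(len(chain), round(mean_x, 3), round(var_x, 3))
```

The empirical variance printed at the end is the quantity needed to calibrate the noise level through the ratio s2n.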

To our knowledge, the estimator $\hat\theta$ is the first consistent estimator in the literature for this
regression function. We first detail the estimator for two choices of the weight function $w$. Then
we recall the classical estimator when $X$ is directly observed and the so-called naive estimator.
                         Estimator (MSE)
n       s2n   param.   arma            N               SC              X               naive
1000    0.5   a        0.288 (0.021)   0.341 (0.013)   0.330 (0.002)   0.326 (0.001)   0.217 (0.015)
              b        0.354 (0.005)   0.331 (0.001)   0.333 (0.001)   0.335 (0.001)   0.389 (0.004)
        1.5   a        0.298 (0.050)   0.332 (0.009)   0.335 (0.007)   0.330 (0.001)   0.136 (0.040)
              b        0.349 (0.012)   0.331 (0.002)   0.329 (0.002)   0.335 (0.001)   0.429 (0.010)
        3     a        0.240 (0.127)   0.343 (0.017)   0.343 (0.018)   0.330 (0.001)   0.084 (0.063)
              b        0.385 (0.033)   0.333 (0.003)   0.333 (0.003)   0.338 (0.001)   0.465 (0.018)
5000    0.5   a        0.333 (0.004)   0.335 (0.003)   0.335 (0.001)   0.333 (0.001)   0.223 (0.012)
              b        0.333 (0.001)   0.332 (0.001)   0.332 (0.001)   0.334 (0.001)   0.388 (0.003)
        1.5   a        0.331 (0.011)   0.328 (0.002)   0.334 (0.001)   0.334 (0.001)   U.433 (0.U41)
              b        0.334 (0.003)   0.334 (0.001)   0.329 (0.001)   0.332 (0.001)   0.132 (0.010)
        3     a        0.290 (0.030)   0.329 (0.003)   0.329 (0.004)   0.333 (0.001)   0.083 (0.063)
              b        0.355 (0.008)   0.335 (0.008)   0.335 (0.008)   0.334 (0.001)   0.459 (0.016)
10000   0.5   a        0.337 (0.002)   0.335 (0.002)   0.334 (0.001)   0.334 (0.001)   0.222 (0.012)
              b        0.331 (0.001)   0.332 (0.001)   0.332 (0.001)   0.332 (0.001)   0.388 (0.003)
        1.5   a        0.322 (0.006)   0.336 (0.001)   0.336 (0.001)   0.334 (0.001)   0.134 (0.040)
              b        0.339 (0.002)   0.332 (0.001)   0.332 (0.001)   0.333 (0.001)   0.433 (0.010)
        3     a        0.329 (0.010)   0.336 (0.002)   0.336 (0.002)   0.334 (0.001)   0.083 (0.063)
              b        0.335 (0.002)   0.332 (0.001)   0.332 (0.001)   0.332 (0.001)   0.457 (0.015)

TABLE 3. Estimation results for Linear Case B, Laplace error. Mean estimated
values of the five estimators $\hat\theta_{arma}$, $\hat\theta_N$, $\hat\theta_{SC}$, $\hat\theta_X$ and $\hat\theta_{naive}$ are presented for
various values of n (1000, 5000 or 10000) and s2n (0.5, 1.5, 3). True values are
$a^0 = 1/3$, $b^0 = 1/3$. MSEs are given in brackets.



4.2.1. Expression of the estimator. We consider the two following weight functions:

(4.18)   $N_c(x) = (1+x^2)^2 \exp\{-x^2/(4\sigma_\varepsilon^2)\}$ and $SC_c(x) = (1+x^2)^2\,\dfrac{1}{2\pi}\left(\dfrac{2\sin(x)}{x}\right)^4$,

with $\sigma_\varepsilon^2$ the variance of $\varepsilon$. This choice of $w$ ensures that Conditions (C1)-(C5) hold, so that our
method achieves the parametric rate of convergence. As in the linear case, these two
weight functions differ by their dependence on $\sigma_\varepsilon^2$ and their smoothness properties. The two
associated estimators are based on the calculation of $S_n(\theta)$, which can be written as

$S_n(\theta) = \dfrac{1}{n}\sum_{k=1}^{n}\left[ Z_k^2\, I_w(Z_{k-1}) - 2\theta Z_k\, I_{wf}(Z_{k-1}) + \theta^2 I_{wf^2}(Z_{k-1}) \right]$,

where

$I_w(Z) = \dfrac{1}{2\pi}\mathrm{Re}\int w^*(u)\,\dfrac{e^{-iuZ}}{f_\varepsilon^*(-u)}\,du$,   $I_{wf}(Z) = \dfrac{1}{2\pi}\mathrm{Re}\int (wf)^*(u)\,\dfrac{e^{-iuZ}}{f_\varepsilon^*(-u)}\,du$

and $I_{wf^2}(Z) = \dfrac{1}{2\pi}\mathrm{Re}\int (wf^2)^*(u)\,\dfrac{e^{-iuZ}}{f_\varepsilon^*(-u)}\,du$.
                         Estimator (MSE)
n       s2n   param.   arma            N               SC              X               naive
1000    0.5   a        0.327 (0.016)   0.349 (0.035)   0.330 (0.003)   0.326 (0.001)   0.218 (0.014)
              b        0.338 (0.004)   0.332 (0.002)   0.336 (0.001)   0.337 (0.001)   0.392 (0.004)
        1.5   a        0.290 (0.061)   0.355 (0.021)   0.345 (0.008)   0.332 (0.001)   0.133 (0.041)
              b        0.353 (0.015)   0.324 (0.004)   0.328 (0.002)   0.333 (0.001)   0.432 (0.010)
        3     a        0.234 (0.153)   0.329 (0.049)   0.329 (0.051)   0.326 (0.001)   0.077 (0.067)
              b        0.383 (0.040)   0.337 (0.010)   0.337 (0.010)   0.337 (0.001)   0.461 (0.017)
5000    0.5   a        0.329 (0.004)   0.341 (0.005)   0.333 (0.001)   0.332 (0.001)   0.220 (0.013)
              b        0.335 (0.001)   0.332 (0.001)   0.334 (0.001)   0.334 (0.001)   0.399 (0.003)
        1.5   a        0.329 (0.009)   0.331 (0.003)   0.332 (0.002)   0.333 (0.001)   0.132 (0.041)
              b        0.335 (0.002)   0.334 (0.001)   0.333 (0.001)   0.333 (0.001)   0.433 (0.010)
        3     a        0.315 (0.022)   0.348 (0.008)   0.348 (0.008)   0.334 (0.001)   0.084 (0.062)
              b        0.343 (0.006)   0.327 (0.002)   0.328 (0.002)   0.332 (0.001)   0.459 (0.016)
10000   0.5   a        0.330 (0.002)   0.333 (0.003)   0.333 (0.001)   0.332 (0.001)   0.221 (0.013)
              b        0.335 (0.001)   0.333 (0.001)   0.333 (0.001)   0.334 (0.001)   0.389 (0.003)
        1.5   a        0.328 (0.006)   0.336 (0.002)   0.334 (0.001)   0.333 (0.001)   0.132 (0.041)
              b        0.336 (0.002)   0.333 (0.001)   0.334 (0.001)   0.334 (0.001)   0.435 (0.010)
        3     a        0.312 (0.014)   0.334 (0.004)   0.334 (0.004)   0.333 (0.001)   0.083 (0.063)
              b        0.344 (0.003)   0.333 (0.001)   0.333 (0.001)   0.333 (0.001)   0.458 (0.016)

TABLE 4. Estimation results for Linear Case B, Gaussian error. Mean estimated
values of the five estimators $\hat\theta_{arma}$, $\hat\theta_N$, $\hat\theta_{SC}$, $\hat\theta_X$ and $\hat\theta_{naive}$ are presented for
various values of n (1000, 5000 or 10000) and s2n (0.5, 1.5, 3). True values are
$a^0 = 1/3$, $b^0 = 1/3$. MSEs are given in brackets.



The estimator can be expressed as

(4.19)   $\hat\theta = \dfrac{\sum_{k=1}^{n} Z_k\, I_{wf}(Z_{k-1})}{\sum_{k=1}^{n} I_{wf^2}(Z_{k-1})}$.

In the following we denote by $I_{wf,N_c}(Z)$, $I_{wf^2,N_c}(Z)$, $I_{wf,SC_c}(Z)$ and $I_{wf^2,SC_c}(Z)$ respectively,
the previous integrals when the weight function is either $w = N_c$ or $w = SC_c$. In the same way
we denote by $\hat\theta_{N_c}$ and $\hat\theta_{SC_c}$ the corresponding estimators of $\theta^0$.
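The ratio form (4.19) translates directly into code once the two functions $I_{wf}$ and $I_{wf^2}$ are available. The sketch below only illustrates that structure; the function name is hypothetical, and the two callables used in the toy call are placeholders, not the deconvolution integrals.

```python
def ratio_estimator(z, i_wf, i_wf2):
    """Ratio (4.19): sum_k Z_k I_wf(Z_{k-1}) / sum_k I_wf2(Z_{k-1}).

    z     -- observations Z_0, ..., Z_n (list of floats)
    i_wf  -- callable approximating I_{wf}
    i_wf2 -- callable approximating I_{wf^2}
    """
    num = sum(z[k] * i_wf(z[k - 1]) for k in range(1, len(z)))
    den = sum(i_wf2(z[k - 1]) for k in range(1, len(z)))
    return num / den

# Toy call with placeholder functions, only to show the indexing:
# numerator = 2*1 + 3*2 = 8, denominator = 1 + 1 = 2.
print(ratio_estimator([1.0, 2.0, 3.0], lambda u: u, lambda u: 1.0))
```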

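• When $w = N_c$, Fourier calculations provide that

$(N_c f)^*(t) = \sqrt{2\pi}\sqrt{2\sigma_\varepsilon^2}\,\exp(-\sigma_\varepsilon^2 t^2)\,\big(1 + 2\sigma_\varepsilon^2(1 - 2\sigma_\varepsilon^2 t^2)\big)$
and $(N_c f^2)^*(t) = \sqrt{2\pi}\sqrt{2\sigma_\varepsilon^2}\,\exp(-\sigma_\varepsilon^2 t^2)$.

Now, we can calculate the integrals $I_{wf,N_c}(Z)$ and $I_{wf^2,N_c}(Z)$.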






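If $f_\varepsilon$ is the Laplace distribution (4.8), replacing $f_\varepsilon^*$ by its expression we obtain

$I_{wf,N_c}(Z) = -\exp(-Z^2/(4\sigma_\varepsilon^2))\,\big[Z^4 - 18Z^2\sigma_\varepsilon^2 + Z^2 + 8\sigma_\varepsilon^4 - 10\sigma_\varepsilon^2\big]/(8\sigma_\varepsilon^2)$,

and $I_{wf^2,N_c}(Z) = \exp(-Z^2/(4\sigma_\varepsilon^2))\,\big[1 + \tfrac{1}{4}\big(1 - Z^2/(2\sigma_\varepsilon^2)\big)\big]$.

Both expressions follow from $1/f_\varepsilon^*(t) = 1 + \sigma_\varepsilon^2 t^2/2$, which turns the integral of a smooth function $\Phi$ into $\Phi - (\sigma_\varepsilon^2/2)\Phi''$.

If $f_\varepsilon$ is the Gaussian distribution (4.9), replacing $f_\varepsilon^*$ by its expression we obtain

$I_{wf,N_c}(Z) = \sqrt{2}\,e^{-Z^2/(2\sigma_\varepsilon^2)}\,(1 - 2\sigma_\varepsilon^2 + 4Z^2)$, and $I_{wf^2,N_c}(Z) = \sqrt{2}\,e^{-Z^2/(2\sigma_\varepsilon^2)}$.

• When $w = SC_c$, easy calculations show that

$I_{wf,SC_c}(Z) = I_{0,SC}(Z) + I_{2,SC}(Z)$ and $I_{wf^2,SC_c}(Z) = I_{0,SC}(Z)$,

where $I_{0,SC}(Z)$ and $I_{2,SC}(Z)$ are defined by (4.16). As explained before, the integrals $I_{0,SC}(Z)$
and $I_{2,SC}(Z)$ have no explicit form, whatever the error distribution, and are numerically
approximated via the IFFT function.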
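The closed forms for $w = N_c$ with Laplace errors can be checked numerically through the defining property of these integrals, namely $E[I_{wf}(x+\varepsilon)] = (wf)(x)$ and $E[I_{wf^2}(x+\varepsilon)] = (wf^2)(x)$. The sketch below is an illustration under two assumptions: the Laplace density is parametrised by its variance $\sigma_\varepsilon^2$, and the expression of $I_{wf,N_c}$ carries the overall sign given by $\Phi - (\sigma_\varepsilon^2/2)\Phi''$. All function names are hypothetical.

```python
import math

def laplace_density(e, sigma):
    """Centered Laplace density with variance sigma**2."""
    return math.exp(-math.sqrt(2.0) * abs(e) / sigma) / (sigma * math.sqrt(2.0))

def i_wf2_laplace(z, sigma):
    """Closed form of I_{wf^2, N_c} for Laplace errors."""
    g = math.exp(-z * z / (4.0 * sigma ** 2))
    return g * (1.0 + 0.25 * (1.0 - z * z / (2.0 * sigma ** 2)))

def i_wf_laplace(z, sigma):
    """Closed form of I_{wf, N_c} for Laplace errors (sign as assumed above)."""
    s2 = sigma ** 2
    g = math.exp(-z * z / (4.0 * s2))
    poly = z ** 4 - 18.0 * z * z * s2 + z * z + 8.0 * s2 ** 2 - 10.0 * s2
    return -g * poly / (8.0 * s2)

def smooth(func, x, sigma, half_width=12.0, step=1e-3):
    """Trapezoidal approximation of E[func(x + eps, sigma)] for Laplace eps."""
    m = int(round(half_width / step))
    total = 0.0
    for j in range(-m, m + 1):
        e = j * step
        weight = 0.5 if abs(j) == m else 1.0
        total += weight * func(x + e, sigma) * laplace_density(e, sigma)
    return total * step

x, sigma = 0.3, 0.5
g = math.exp(-x * x / (4.0 * sigma ** 2))           # (w f^2)(x) for w = N_c
print(smooth(i_wf2_laplace, x, sigma), g)            # the two values should agree
print(smooth(i_wf_laplace, x, sigma), (1.0 + x * x) * g)
```

The same check does not apply verbatim to $w = SC_c$, for which the integrals have no closed form.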

4.2.2. Comparison with classical estimators. We compare our estimators with two classical
estimators: the usual least squares estimator without observation noise, and the naive estimator.

• Estimator without noise. When $\varepsilon_i = 0$, that is when $(X_0, \ldots, X_n)$ is observed without errors, the
parameter can be easily estimated by the usual least squares estimator

$\hat\theta_X = \dfrac{\sum_{i=1}^{n} X_i f(X_{i-1})}{\sum_{i=1}^{n} f^2(X_{i-1})}$.

• Naive estimator. The idea for the construction of the naive estimator is to replace the
unobserved $X_i$ by the observation $Z_i$ in the expression of $\hat\theta_X$ to get

$\hat\theta_{naive} = \dfrac{\sum_{i=1}^{n} Z_i f(Z_{i-1})}{\sum_{i=1}^{n} f^2(Z_{i-1})}$.

Classical results show that $\hat\theta_{naive}$ is an asymptotically biased estimator of $\theta^0$, which is confirmed
by the simulation study.
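The two classical estimators share the same least squares formula and differ only in the sequence they are fed with. The following sketch, given purely as an illustration, computes both on one simulated sample with Gaussian measurement error and s2n = 3; the seed, the sample size and the innovation standard deviation 0.1 are assumptions.

```python
import random

def f(x):
    """Regression shape f(x) = 1 / (1 + x^2), so that f_theta = theta * f."""
    return 1.0 / (1.0 + x * x)

def least_squares(y):
    """sum_i y_i f(y_{i-1}) / sum_i f(y_{i-1})^2 for a sequence y_0, ..., y_n."""
    num = sum(y[i] * f(y[i - 1]) for i in range(1, len(y)))
    den = sum(f(y[i - 1]) ** 2 for i in range(1, len(y)))
    return num / den

rng = random.Random(1)
theta0, n = 1.5, 5000
x = [rng.uniform(0.0, 1.0)]
for _ in range(1000 + n):
    x.append(theta0 * f(x[-1]) + rng.gauss(0.0, 0.1))
x = x[1000:]                               # drop the burn-in, keep X_0, ..., X_n

mean_x = sum(x) / len(x)
var_x = sum((v - mean_x) ** 2 for v in x) / len(x)
sigma_eps = (3.0 * var_x) ** 0.5           # noise level such that s2n = 3
z = [v + rng.gauss(0.0, sigma_eps) for v in x]

theta_x = least_squares(x)                 # uses the unobserved chain
theta_naive = least_squares(z)             # plugs in the noisy observations
print(round(theta_x, 3), round(theta_naive, 3))
```

On such a sample the naive value is expected to fall visibly below 1.5, in line with the asymptotic bias mentioned above.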

4.2.3. Simulation results. For each error distribution, we simulate 100 samples of size n,
n = 1000, 5000 and 10000. We consider different values of $\sigma_\varepsilon$ such that the signal-to-noise ratio
s2n $= \sigma_\varepsilon^2/\mathrm{Var}(X)$ is 0.5, 1.5 or 3.

The comparison of the four estimators is based on the bias, the Mean Squared Error (MSE),
and the box plots. The results are presented in Figure 3 and Tables 5 and 6.
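For reference, the two summary statistics and the calibration of the noise level can be written in a few lines. This is a generic sketch with hypothetical function names, not the code behind the reported tables.

```python
def bias_and_mse(estimates, truth):
    """Empirical bias and mean squared error over replications."""
    m = len(estimates)
    bias = sum(estimates) / m - truth
    mse = sum((e - truth) ** 2 for e in estimates) / m
    return bias, mse

def noise_sd(var_x, s2n):
    """Standard deviation sigma_eps such that s2n = sigma_eps**2 / Var(X)."""
    return (s2n * var_x) ** 0.5

# Two fictitious replications around the true value 1.5:
print(bias_and_mse([1.4, 1.6], 1.5))
print(noise_sd(0.1, 1.5))
```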

The first thing to notice is that, not surprisingly, $\hat\theta_{naive}$ presents a bias, whatever the values
of n, s2n and the error distribution. Moreover it converges to (false) values which differ
according to s2n (see Tables 5 and 6).

The estimator $\hat\theta_X$ has the expected good properties (unbiased and small MSE), but it is based
on the observation of the $X_i$'s.

We now compare our two estimators, illustrating the influence of $w$, s2n and $f_\varepsilon$. Globally,
whatever the weight function $w$, the two estimators present good convergence properties. Their
biases and MSEs decrease when n increases. The MSEs of $\hat\theta_{SC_c}$ increase when s2n increases.
This is not the case for the MSE of $\hat\theta_{N_c}$. This is probably due to the fact that the weight
                Estimator (MSE)
n       s2n   N_c               SC_c              X                 naive
1000    0.5   1.5095 (0.0042)   1.5024 (0.0006)   1.5004 (0.0000)   1.4333 (0.0050)
        1.5   1.5006 (0.0021)   1.5005 (0.0013)   1.5002 (0.0000)   1.3657 (0.0190)
        3     1.5017 (0.0024)   1.5005 (0.0024)   1.5002 (0.0000)   1.3267 (0.0314)
5000    0.5   1.5045 (0.0008)   1.5005 (0.0001)   1.5003 (0.0000)   1.4320 (0.0047)
        1.5   1.5003 (0.0004)   1.4994 (0.0003)   1.4997 (0.0000)   1.3647 (0.0185)
        3     1.4989 (0.0005)   1.4992 (0.0005)   1.5000 (0.0000)   1.3223 (0.0318)
10000   0.5   1.5033 (0.0004)   1.5002 (0.0001)   1.5000 (0.0000)   1.4315 (0.0047)
        1.5   1.5000 (0.0002)   1.5000 (0.0001)   1.4998 (0.0000)   1.3650 (0.0183)
        3     1.4972 (0.0002)   1.4970 (0.0002)   1.4998 (0.0000)   1.3222 (0.0317)

TABLE 5. Estimation results for the Cauchy regression model, Laplace error. Mean estimated values
of the four estimators $\hat\theta_{N_c}$, $\hat\theta_{SC_c}$, $\hat\theta_X$ and $\hat\theta_{naive}$ are presented for various values
of n (1000, 5000 or 10000) and s2n (0.5, 1.5, 3). True value is $\theta^0 = 1.5$. MSEs are
given in brackets.





                Estimator (MSE)
n       s2n   N_c               SC_c              X                 naive
1000    0.5   1.4979 (0.0027)   1.4998 (0.0006)   1.5000 (0.0000)   1.4230 (0.0064)
        1.5   1.4995 (0.0029)   1.5001 (0.0015)   1.5005 (0.0000)   1.3336 (0.0287)
        3     1.5080 (0.0049)   1.5058 (0.0042)   1.4997 (0.0000)   1.2832 (0.0487)
5000    0.5   1.5033 (0.0006)   1.5011 (0.0001)   1.4999 (0.0000)   1.4250 (0.0057)
        1.5   1.5011 (0.0004)   1.5001 (0.0003)   1.4999 (0.0000)   1.3351 (0.0274)
        3     1.4998 (0.0009)   1.4996 (0.0008)   1.5002 (0.0000)   1.2767 (0.0501)
10000   0.5   1.5017 (0.0003)   1.4997 (0.0000)   1.4996 (0.0000)   1.4236 (0.0059)
        1.5   1.5025 (0.0003)   1.5027 (0.0002)   1.5001 (0.0000)   1.3375 (0.0265)
        3     1.5016 (0.0004)   1.5021 (0.0004)   1.5002 (0.0000)   1.2778 (0.0495)

TABLE 6. Estimation results for the Cauchy regression model, Gaussian error. Mean estimated values
of the four estimators $\hat\theta_{N_c}$, $\hat\theta_{SC_c}$, $\hat\theta_X$ and $\hat\theta_{naive}$ are presented for various values
of n (1000, 5000 or 10000) and s2n (0.5, 1.5, 3). True value is $\theta^0 = 1.5$. MSEs are
given in brackets.



function chosen for the construction of $\hat\theta_{N_c}$ depends on $\sigma_\varepsilon^2$. This estimator is thus more adaptive
to changes in s2n.



5. A MORE GENERAL ESTIMATOR 

For a large number of regression functions, a weight function $w$ such as the one involved in
the definition of the estimator $\hat\theta$ can be easily exhibited. Nevertheless, for some specific
regression functions, it is not straightforward to find a weight function such that $(wf_\theta)^*/f_\varepsilon^*$ and
$(wf_\theta^2)^*/f_\varepsilon^*$ are integrable. We refer to Butucea and Taupin (?) for a more complete discussion on
this subject. Therefore, we propose a generalization of this estimator to relax these conditions.
[Figure 3: box plots of thetahatN, thetahatSC, thetahatX and thetahatnaive; vertical axis from 1.3 to 1.55.]

FIGURE 3. Results for the Cauchy regression model and Gaussian error, with n = 5000 and
$\sigma_\varepsilon^2/\mathrm{Var}(X) = 1.5$. Box plots of the four estimators $\hat\theta_{N_c}$, $\hat\theta_{SC_c}$, $\hat\theta_X$ and $\hat\theta_{naive}$,
from left to right, based on 100 replications. True value is 1.5 (horizontal line).



5.1. Definition of the general estimator. The key idea for this construction is the following.
We introduce a density deconvolution kernel $K_{n,C_n}$ defined via its Fourier transform $K^*_{n,C_n}$ by

(5.20)   $K^*_{n,C_n}(t) = \dfrac{K^*(t/C_n)}{f_\varepsilon^*(-t)} =: \dfrac{K^*_{C_n}(t)}{f_\varepsilon^*(-t)}$,

where $K^*$ is the Fourier transform of a kernel $K$ and $C_n$ is a sequence which tends to infinity
with n. The kernel $K$ belongs to $L_2(\mathbb{R})$. Its Fourier transform $K^*$ is compactly supported and
satisfies $|1 - K^*(t)| \le 1_{|t| \ge 1}$. Then, for any integrable function $\Phi$, one has
$\lim_{n\to\infty} n^{-1}\sum_{i=1}^{n} \Phi * K_{n,C_n}(Z_i) = E(\Phi(X))$. Hence we estimate $E(\Phi(X))$ by
$n^{-1}\sum_{i=1}^{n} \Phi * K_{n,C_n}(Z_i)$ instead of $n^{-1}\sum_{i=1}^{n} \Phi(X_i)$, which is not available.
We then propose to estimate $S_{\theta^0,P_X}(\theta)$ by

(5.21)   $S_n(\theta) = \dfrac{1}{n}\sum_{i=1}^{n} \mathrm{Re}\big[\big((Z_i - f_\theta)^2 w\big) * K_{n,C_n}(Z_{i-1})\big]
= \dfrac{1}{n}\sum_{i=1}^{n} \mathrm{Re}\int \big(Z_i - f_\theta(x)\big)^2 w(x)\, K_{n,C_n}(Z_{i-1} - x)\,dx$.

Using this more general empirical criterion we propose to estimate $\theta^0$ by

(5.22)   $\hat\theta = \arg\min_{\theta\in\Theta} S_n(\theta)$.

Note that the general construction relies on a truncation of the integrals in (2.4). Also note
that this general construction still works under Conditions (C1)-(C5): it suffices to choose
$K^*(t/C_n) = 1_{|t| \le C_n}$ with $C_n = +\infty$.
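To make the truncation concrete, the sketch below evaluates $\Phi * K_{n,C_n}(z)$ for the sinc-type choice $K^*(t/C_n) = 1_{|t|\le C_n}$, a Gaussian test function $\Phi(x) = \exp(-x^2/(4\sigma_\varepsilon^2))$ and Gaussian errors $\mathcal{N}(0,\sigma_\varepsilon^2)$. These choices, the function name and the numerical values are assumptions made for illustration only; for large $C_n$ the result should approach the closed form $\sqrt{2}\,e^{-z^2/(2\sigma_\varepsilon^2)}$ of Section 4.2.1.

```python
import math

def truncated_deconvolution(z, sigma, c_n, step=0.01):
    """(1/2pi) * integral over |t| <= c_n of Phi*(t) e^{-itz} / f_eps*(-t) dt.

    Phi(x) = exp(-x^2 / (4 sigma^2)), so Phi*(t) = 2 sigma sqrt(pi) exp(-sigma^2 t^2);
    f_eps*(t) = exp(-sigma^2 t^2 / 2) for N(0, sigma^2) errors.
    Both transforms are real and even, so only cos(t z) contributes.
    """
    m = int(round(c_n / step))
    total = 0.0
    for j in range(-m, m + 1):
        t = j * step
        phi_star = 2.0 * sigma * math.sqrt(math.pi) * math.exp(-(sigma * t) ** 2)
        f_eps_star = math.exp(-0.5 * (sigma * t) ** 2)
        weight = 0.5 if abs(j) == m else 1.0      # trapezoidal end-point weights
        total += weight * phi_star * math.cos(t * z) / f_eps_star
    return total * step / (2.0 * math.pi)

z, sigma = 0.3, 0.5
limit = math.sqrt(2.0) * math.exp(-z * z / (2.0 * sigma ** 2))
for c_n in (2.0, 5.0, 20.0):
    print(c_n, truncated_deconvolution(z, sigma, c_n), limit)
```

Small values of $C_n$ cut off part of the integrand and introduce a bias, while large values remove it; when the ratio $\Phi^*/f_\varepsilon^*$ is not integrable, this is where the trade-off behind $C_n$ appears.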
5.2. Asymptotic properties under general assumptions. This section presents the
asymptotic properties of $\hat\theta$ defined by (5.22) under milder conditions than Conditions (C1)-(C5),
when one cannot exhibit a weight function $w$ ensuring that these conditions hold. In this context
the estimator is still consistent, but with a rate which is not necessarily the parametric rate.
For the sake of simplicity we only consider the case of $\alpha$-mixing Markov chains.
We assume that

(A3) On $\Theta^0$, the quantity $w^2(X_0)(Z_1 - f_\theta(X_0))^4$ and the absolute values of its derivatives
with respect to $\theta$ up to order 2 have a finite expectation.

(A4) The quantity $\sup_{n}\,\sup_{j\in\{1,\ldots,d\}} E\Big(\sup_{\theta\in\Theta^0}\Big|\dfrac{\partial S_n(\theta)}{\partial\theta_j}\Big|\Big)$ is finite.

(A5) $\sup_{\theta\in\Theta}|wf_\theta|$, $|w|$ and $\sup_{\theta\in\Theta}|wf_\theta^2|$ belong to $L_1(\mathbb{R})$.



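We say that a function $\psi \in L_1(\mathbb{R})$ satisfies (5.23) if for a sequence $C_n$ we have

(5.23)   $\min_{q=1,2}\big\|\psi^*(K^*_{C_n} - 1)\big\|_q^2 + n^{-1}\min_{q=1,2}\Big\|\dfrac{\psi^* K^*_{C_n}}{f_\varepsilon^*}\Big\|_q^2 = o(1)$.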



Theorem 5.1. Under the assumptions ((IliD , ( II2D , |[Ni[ j, (fAi| j ( A3 ) - (A5D , let 9 be defined 
by (5.22) with $C_n$ such that (5.23) holds for $w$, $wf_\theta$ and $wf_\theta^2$ and their first derivatives with
respect to $\theta$. Assume that the sequence $(X_k)_{k\ge 0}$ is $\alpha$-mixing, that is

$\alpha_X(k) \to 0$, as $k \to \infty$.

Then $E(\|\hat\theta - \theta^0\|^2_{\ell^2}) = o(1)$, as $n \to \infty$, and $\hat\theta$ is a consistent estimator of $\theta^0$.

We now give upper bounds for the rates of convergence under two different types of assump- 
tions: 

(Ae) Xq admits a density fx with respect to the Lebesgue measure and there exist two 
constants Ci(/|o) and C^/go) such that || fgofx |||< Ci(fgo), an d 
II fgofx C 2 (fgo). 

(A T ) supE[f 2 {X )f £ (z - X )] and supE[/ e (z - X )] are finite. 

These two assumptions are mostly required for technical reasons. The following theorem still 
holds when Xq does not admit a density, under a slightly different moment assumption. 

Theorem 5.2. Suppose that the assumptions of Theorem 5.1 hold. Assume moreover that the
sequence $(X_k)_{k\ge 0}$ is $\alpha$-mixing with $\sum_{k\ge 1}\sqrt{\alpha_X(k)} < \infty$, and that, for all $\theta \in \Theta$, the functions
$w$, $f_\theta w$ and $f_\theta^2 w$ and their derivatives up to order 3 with respect to $\theta$ satisfy (5.23).

1 ) Assume that the sequence Xq admits a density with respect to the Lebesgue measure and 
that Assumption {[Aq\ ) holds. Then 9 — 9° = O p (tffy with ip n = ||(<^n,i)||^2, ip^j = ^nj + Vn,j/n, 
j = 1 . . . , d, where 



B 



n j 



mm 



(b [1] . b [2] .\ 



and V, 



71. J' 



--mmW [1] V [2] 
mm <^ v n j , v n j 



!} 
B



r(i) 



and 



[<?] 



(1) v^O, 



/I 



+ 



[wfeof, 



K, 



( i, 



fe 



2) Assume that {Agp /io/ds. TTiera — U = O p (<p„) urai/i </?„ = ||(</£>n,j)||^ } = -B^+V^j/n, 
= 1 . . . , d, where B n j = B^l and V n j = min 



\v ll] ,vnx 



This theorem states an upper bound for the quadratic risk under very general conditions. It
holds under mild conditions on $w$, $f_\theta$ and $f_\varepsilon$. We refer to Table 1 in Butucea and Taupin (?)
for more details on the resulting rates.

Appendix A. Properties of the dependence coefficients and examples 

A.1. Covariance inequalities and coupling. The following results are the key arguments to
prove the asymptotic normality of $\hat\theta$. We keep the same notations as in Definition 3.1.

We first recall a covariance inequality due to Rio (?). For any positive random variable $Z$,
let $Q_Z$ be the inverse cadlag of the tail function $t \mapsto P(Z > t)$. Let $X$ and $Y$ be two real valued
random variables such that $\mathrm{Cov}(X, Y)$ is well defined. The following inequality holds:

(A.24)   $|\mathrm{Cov}(Y, X)| \le 4\displaystyle\int_0^{\alpha(\sigma(Y),\sigma(X))} Q_{|X|}(u)\,Q_{|Y|}(u)\,du$.

Next, we recall the coupling properties of $\tau$ (see Dedecker and Prieur (?)): enlarging $\Omega$ if
necessary, there exists $X^*$ distributed as $X$ and independent of $\mathcal{M}$ such that

(A.25)   $\tau(\mathcal{M}, X) = E(\|X - X^*\|_{\ell^1})$.

A.2. Dependence properties of autoregressive models. We recall here the mixing
properties of the autoregressive models

$X_i = f_{\theta^0}(X_{i-1}) + \xi_i$

that have been described in particular in the papers by Mokkadem (?) and Ango-Nze (?). For
instance, assume that

• the law of $\xi_0$ has a density $f_\xi$ such that $f_\xi \ge c > 0$ on a neighborhood of zero, and
there exists $S \ge 1$ such that $E(|\xi_0|^S) < \infty$.

• $f_{\theta^0}$ is continuous and there exist $R > 1$ and $\rho \in\, ]0,1[$ such that: for any $|x| > R$,
$|f_{\theta^0}(x)| \le \rho|x|$.

Then there exists a unique invariant probability measure, and the stationary Markov chain
$(X_i)_{i\ge 0}$ satisfies $\alpha_X(k) = O(\kappa^k)$ for some $\kappa \in\, ]0,1[$ and is $\alpha$-mixing.
Now if the second point is weakened to

• $f_{\theta^0}$ is continuous and there exist $R > 1$ and $\delta \in\, ]0,1[$ such that: for any $|x| > R$,
$|f_{\theta^0}(x)| \le |x|(1 - |x|^{-\delta})$.

Then there exists a unique invariant probability measure, and the stationary Markov chain
$(X_i)_{i\ge 0}$ satisfies $\alpha_X(k) = O(k^{1-S/\delta})$ and is $\alpha$-mixing.

Now, if we do not assume that $\xi_0$ has a density, then the chain may not be $\alpha$-mixing (and not
even irreducible). However, under appropriate assumptions on $f_{\theta^0}$, it is still possible to obtain
upper bounds for the coefficient $\tau$. For instance assume that

• there exists $S \ge 1$ such that $E(|\xi_0|^S) < \infty$.

• $|f_{\theta^0}(x) - f_{\theta^0}(y)| \le \rho|x - y|$ for some $\rho \in [0,1[$.

Then there exists a unique invariant probability measure, and the stationary Markov chain
$(X_i)_{i\ge 0}$ satisfies $\tau_{X,2}(k) = O(\rho^k)$ and is $\tau$-dependent. Now if the second point is weakened to

• there exist $\delta$ in $[0,1[$ and $C$ in $]0,1]$ such that $|f'_{\theta^0}(t)| \le 1 - C(1+|t|)^{-\delta}$ almost
everywhere.

Then there exists a unique invariant probability measure, and for $S > 1 + \delta$ the stationary
Markov chain $(X_i)_{i\ge 0}$ satisfies $\tau_{X,2}(k) = O(k^{(\delta+1-S)/\delta})$ and is $\tau$-dependent.
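As a small side illustration (not taken from the simulation study), the contraction condition $|f_{\theta^0}(x) - f_{\theta^0}(y)| \le \rho|x-y|$ can be checked numerically for the Cauchy regression function $f_\theta(x) = \theta/(1+x^2)$ of Section 4.2, since by the mean value theorem $\rho$ can be taken as $\sup_x |f'_\theta(x)|$. The grid bounds and the function name are assumptions.

```python
import math

def cauchy_lipschitz_constant(theta, grid_max=10.0, step=1e-4):
    """Grid approximation of sup_x |f_theta'(x)| for f_theta(x) = theta / (1 + x^2).

    |f_theta'(x)| = 2 * theta * |x| / (1 + x^2)^2 is even in x and tends to 0
    at infinity, so a grid on [0, grid_max] is enough.
    """
    best = 0.0
    m = int(round(grid_max / step))
    for j in range(m + 1):
        x = j * step
        best = max(best, 2.0 * theta * x / (1.0 + x * x) ** 2)
    return best

rho = cauchy_lipschitz_constant(1.5)
# The maximum is attained at x = 1/sqrt(3), giving theta * 3 * sqrt(3) / 8.
print(rho, 1.5 * 3.0 * math.sqrt(3.0) / 8.0)
```

For $\theta = 1.5$ the constant stays below 1, so the contraction condition holds for this particular function whatever the innovation distribution with a finite moment.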



Appendix B. Proofs of Theorems

B.1. Proof of Theorem 3.1. The main point of the proof consists in showing the two following
points:

i) for any $\theta$ in $\Theta$, $S_n(\theta) \xrightarrow[n\to\infty]{L^1} S_{\theta^0,P_X}(\theta)$, with $S_{\theta^0,P_X}(\theta)$ admitting a unique minimum at
$\theta = \theta^0$.

ii) For $\omega_2(n,\rho)$ defined as $\omega_2(n,\rho) = \sup\{|S_n(\theta) - S_n(\theta')| : \|\theta - \theta'\|_{\ell^2} \le \rho\}$, there exists a
sequence $\rho_k$ tending to 0, such that

(B.1)   $E(\omega_2(n,\rho_k)) = O(\rho_k)$.

Let us start with the proof of i) by writing that

$S_n(\theta) = \dfrac{1}{n}\sum_{k=1}^{n}\psi(Z_k, Z_{k-1})$, with $\psi(Z_1, Z_0) = \dfrac{1}{2\pi}\mathrm{Re}\displaystyle\int \dfrac{\big((Z_1 - f_\theta)^2 w\big)^*(t)\,e^{-itZ_0}}{f_\varepsilon^*(-t)}\,dt$,

that is seen as a function of a strictly stationary and ergodic sequence of random variables. By
the ergodic theorem and Assumption (A2) we conclude that for any $\theta \in \Theta$,

$S_n(\theta) \xrightarrow[n\to\infty]{L^1} E(\psi(Z_1, Z_0)) = S_{\theta^0,P_X}(\theta)$.

It remains now to check that there exists a sequence $\rho_k$ tending to 0, such that (B.1) holds.
This follows from the assumption (Cj) and by writing that

(B.2)   $\sup_{\|\theta-\theta'\|_{\ell^2}\le\rho} |S_n(\theta) - S_n(\theta')| \le \sup_{\|\theta-\theta'\|_{\ell^2}\le\rho}\|\theta - \theta'\|_{\ell^2}\;\sup_{\theta\in\Theta^0}\|S_n^{(1)}(\theta)\|_{\ell^2}$.

□
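B.2. Proof of Theorem 3.2. By using a Taylor expansion based on the smoothness properties
of $\theta \mapsto wf_\theta$ and the consistency of $\hat\theta$, we obtain

$0 = S_n^{(1)}(\hat\theta) = S_n^{(1)}(\theta^0) + S_n^{(2)}(\theta^0)(\hat\theta - \theta^0) + R_n(\hat\theta - \theta^0)$,

with $R_n$ defined by

(B.3)   $R_n = \displaystyle\int_0^1 \big[S_n^{(2)}(\theta^0 + s(\hat\theta - \theta^0)) - S_n^{(2)}(\theta^0)\big]\,ds$.

This implies that

(B.4)   $\hat\theta - \theta^0 = -\big[S_n^{(2)}(\theta^0) + R_n\big]^{-1} S_n^{(1)}(\theta^0)$.

Consequently, we have to check the three following points.

i) $\sqrt{n}\,S_n^{(1)}(\theta^0) \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}(0, \Sigma_{0,1})$;

ii) $S_n^{(2)}(\theta^0) \xrightarrow[n\to\infty]{P} S^{(2)}_{\theta^0,P_X}(\theta^0)$;

iii) $R_n$ defined in (B.3) satisfies $R_n \xrightarrow[n\to\infty]{P} 0$.

Note that the covariance matrix $\Sigma_{0,1}$ in i) satisfies $\Sigma_{0,1} = \Sigma/(4\pi^2)$, with $\Sigma$ defined by the
equation (B.6) below. Consequently, according to ii) and iii), the covariance matrix $\Sigma_1$ satisfies

(B.5)   $\Sigma_1 = \dfrac{1}{4\pi^2}\big(S^{(2)}_{\theta^0,P_X}(\theta^0)\big)^{-1}\,\Sigma\,\big(S^{(2)}_{\theta^0,P_X}(\theta^0)\big)^{-1}$, with $\Sigma$ defined by (B.6).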

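Proof of i)

Under Assumption (C2),

$\sqrt{n}\,S_n^{(1)}(\theta^0) = \dfrac{1}{\sqrt{n}}\sum_{k=1}^{n}\dfrac{1}{2\pi}\mathrm{Re}\displaystyle\int \Big(\dfrac{\partial}{\partial\theta}\big((Z_k - f_\theta)^2\big)\Big|_{\theta=\theta^0} w\Big)^*(t)\,\dfrac{e^{-itZ_{k-1}}}{f_\varepsilon^*(-t)}\,dt$.

We have thus to prove that

$\dfrac{1}{\sqrt{n}}\sum_{k=1}^{n}\dfrac{1}{2\pi}\mathrm{Re}\displaystyle\int \big(-2(Z_k - f_{\theta^0})f^{(1)}_{\theta^0} w\big)^*(t)\,\dfrac{e^{-itZ_{k-1}}}{f_\varepsilon^*(-t)}\,dt \xrightarrow[n\to\infty]{\mathcal{L}} \mathcal{N}(0, \Sigma_{0,1})$.

We first use that $E(S_n(\theta)) = S_{\theta^0,P_X}(\theta)$ and thus $E(S_n^{(1)}(\theta^0)) = S^{(1)}_{\theta^0,P_X}(\theta^0) = 0$. Next we write

$\sqrt{n}\,S_n^{(1)}(\theta^0) = \sqrt{n}\,S_n^{(1)}(\theta^0) - E\big[\sqrt{n}\,S_n^{(1)}(\theta^0)\big] = \dfrac{1}{2\pi\sqrt{n}}\sum_{k=1}^{n} T_k$,

with $T_k = -2W_{k,1} + 2W_{k,2}$, and

$W_{k,1} = Z_k\,\mathrm{Re}\displaystyle\int (f^{(1)}_{\theta^0} w)^*(t)\,\dfrac{e^{-itZ_{k-1}}}{f_\varepsilon^*(-t)}\,dt - E\Big[Z_k\,\mathrm{Re}\displaystyle\int (f^{(1)}_{\theta^0} w)^*(t)\,\dfrac{e^{-itZ_{k-1}}}{f_\varepsilon^*(-t)}\,dt\Big]$,

$W_{k,2} = \mathrm{Re}\displaystyle\int (f^{(1)}_{\theta^0} f_{\theta^0} w)^*(t)\,\dfrac{e^{-itZ_{k-1}}}{f_\varepsilon^*(-t)}\,dt - E\Big[\mathrm{Re}\displaystyle\int (f^{(1)}_{\theta^0} f_{\theta^0} w)^*(t)\,\dfrac{e^{-itZ_{k-1}}}{f_\varepsilon^*(-t)}\,dt\Big]$.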
Let $\mathcal{M}_1 = \sigma(X_0, X_1, \varepsilon_0, \varepsilon_1)$. According to Dedecker and Rio (?), $n^{-1/2}\sum_{k=1}^{n} T_k$ converges to a
centered Gaussian vector with covariance matrix

(B.6)   $\Sigma = \mathrm{Cov}(T_1, T_1) + 2\sum_{k>1}\mathrm{Cov}(T_1, T_k)$,

as soon as for any $(p,q)$ in $\{1,\ldots,d\}\times\{1,\ldots,d\}$

(B.7)   $\sum_{k\ge 3} E\big|(T_1)_p\,E((T_k)_q\,|\,\mathcal{M}_1)\big| < \infty$.

For any $(p,q)$ in $\{1,\ldots,d\}\times\{1,\ldots,d\}$ and any $i,j \in \{1,2\}$, we shall give an upper bound for

$E\big|(W_{1,i})_p\,E((W_{k,j})_q\,|\,\mathcal{M}_1)\big|$.

We first notice that the sequence $(\varepsilon_k, \varepsilon_{k-1})$ is independent of $\mathcal{M}_1 \vee \sigma(X_k, X_{k-1})$. It follows that
for $i,j \in \{1,2\}$,

$E\big|(W_{1,i})_p\,E((W_{k,j})_q\,|\,\mathcal{M}_1)\big| = E\big|(W_{1,i})_p\,E((\widetilde W_{k,j})_q\,|\,\mathcal{M}_1)\big|$,

with 



^dt-E 



(W, 



k,2)q 



[f e of$w)*(t)e- itx k-idt-E 



X kJ {f$w)*(t)e- ux ^dt 
{f e of^wY(t)e- ux ^dt 



Next, since V(x h _ u x k )\< r ( eo *i,x ,x 1 ) = P(x fc _ 1 ,Jf J ,)|a(Jf 1 )» we infer that 

E|(W M ) P E((W fc)i ) g |Mi)| =E|(Wi 1 i) p E((W' fc)i ),|Xi) 
Next we use that under Condition QC2D, 



|(Wi,i) p | < \Zi\ 
< \Zi\ 



1 



< ddZxI+EdZil)). 

In the same way we get that |(W r i 1 2)p| < C2. 

Now, since £\ is independent of X\, for j G {1, 2} 



(it + E< 




dt + E< 





(/£»*(*)- 



-itZn 



E 



(B.8) 

In the same way 
(B.9) 

Note that 



< CiE 

< CE 



dJfil+Ed^D) E((^,,) g |X 1 ) 



E 



(W 1 , 2 ) P E((^,,) (? |X 1 ) < CE E((W fe)j ) 9 |X 1 ) 



E 



(|Xl| + E(|X X |)) E((^ jl ) 9 |X 1 ) = Cov((|Xi| + E(|X 1 |))sign(E((# fcil ) g |X 1 )), (W M ), 
Now, we use the covariance inequality (A.24). Note first that

(|Xi| +E(|X 1 |))sign(E((W M ) g |X 1 )) < \X^ +E(\X 1 \) 

and 

\(W lt i) q \ KDdXil+EQXx])). 
Since (Xj)j>o is a strictly stationary Markov chain, it is well known that 

(B.10) a(a{X 1 )MXk-l,Xk)) = a(a{X 1 ),*(Xk-l)) = <xx(k-2). 

Hence, applying (|A.24|h 

ra x (fe-2) 



E 



(Wi,i) P E((W M )jXi) 



< C 



Q ]Xi] (u)du. 



We conclude that 



^2 E I (Wi ) i)pM.((Wk,i) q \Mi)\ <CY, 



k>3 



k>3 



a x (fc-2) 



Qfx 1 \( u ) du - 



Finally, using similar arguments for the three quantities J2k>3 E I (Wi,2)pE((Wk,i)q\Mi) 
Ek>3 E \(Wil)p E (( W k,2)q\Mi)\ and Efc>3 E l(^i,2) P E((W fc)2 ) 9 |Mi)| we conclude that 



as soon as 



V^S^(9°) A AA(0,S/(4^ 2 )) 

n— >oo 

fc>i ^° 



Proof of ii) 



Under Condition (C3 ), for j, k = 1, • • ■ , d, 



c) 2 



-($«;) (<) 



and by applying the ergodic theorem we get that 
□

Proof of iii)

Starting from (B.3) and (B.11), the point iii) follows from the assumption (C4) on the properties
of the derivatives at order 3 of $wf_\theta$ and $wf_\theta^2$. □
B.3. Proof of Theorem 3.3. We follow the proof of Theorem 3.2 and keep the same notations.
We have to check that the condition (B.7) holds. We start from the inequalities (B.8) and (B.9).
For clarity, let us write

$(\widetilde W_{k,1})_q = (\widetilde W_{k,1})_q(X_k, X_{k-1})$.

Let $\varphi_M$ be the truncating function defined by $\varphi_M(x) = (x \wedge M) \vee (-M)$. Applying (A.25), let
$(X_k^*, X_{k-1}^*)$ be the random variable distributed as $(X_k, X_{k-1})$ and independent of $X_1$ such that

\(\\X k - X* k \\ x + \\X k _ x - X* k _ x \\ x ) = r(a(X x ), {X k _ x ,X k )) < r x , 2 (k - 2) . 



Define the constants K x and K 2 by 



K x 



dt < oo , K 2 



dt < 00 . 



Clearly 

\X x E((W k MX k ,X k _ x )\X x )\ <M|E((H^ M ) 9 (X fe ,X fe _ 1 )|X 1 )|+K 2 |X 1 |l| Xl | >M (|X fe |+E(|X fc |)). 
Now, since is independent of Xi, one has that 

|E((w fc ,i),(x fc) x fc _i)|Xi)| = \E((Wh,i) q (x k , x k -i) - (w k , x ) q (n,n-i)\Xi)\ . 

By definition of (W k>x ) q (X k , X k _ x ), there exists a constant C such that 

< C(|Afc|l|x fc |>M + l^fcl 1 |X*|>A/)- 

Hence 

\E({W k , x ) q (X k ,X k - X )\X x )\ < |E((Wj M ),foMX*) J X fc _i) - (WjfcAC^C^fc). ^fc-i)!^!)! 

+ C(|Afc|l|x fc |>M + |Afc|l|x*|>A/) • 

Since ipM is 1-Lipschitz and bounded by M, and since x — > exp(itx) is |i|-Lipschitz and bounded 
by 1, under Condition QC5D, one has 



\{W k>1 ) q ^M{X k ),X k _ x ) - {W h , x ) q W> M (Xl),Xi_ x )\ < MK 2 \X k _ x - X* k _ x \ + K x \X k - X* k \ . 
It follows that 

\X x E((W k , x ) q (X k ,X k - X )\X x )\ < K 2 \X x \l lXll>M (\X k \+E(\X k \)) 

+ M 2 K 2 \X k _ x - X* k _ x \ + MK x \X k - Xl\) 
+ CM(\X k \l\ Xk \ >M + |X^|1| X *| >M ) . 

Using that 



3 -2, ^2, 



|Ai|l|X!|>M|Afc| < -zX x l\ Xl \ >M + ~X k l\ Xk \ 



>M > 



we infer from (|B.8P with j = 1 that there exists a positive constant such that 



E 



(W XiX ) p E((W kiX ) q \X x ) < K(L(M 2 ) + M(M + l)rx, 2 (fc - 2)) , 
where $L(t) = E(X_0^2 1_{X_0^2 > t})$. Let then $G(t) = t^{-1}L(t)$, and let $G^{-1}$ be the inverse cadlag of $G$.
Choose then $M^2 = G^{-1}(\tau_{X,2}(k-2))$. We obtain that



E 



(W M ) p E((W M ) 9 |Xi) J < 2K(2G-\r^ 2 (k-2))T^ 2 (k-2)+^G- 1 (T^ 2 (k - 2))r x , 2 (fc-2)) 
It follows that 

J^E [|(W 1i1 ) p E((^ i1 ), 2 |X 1 )|] < oo as soon as ^ G~' (r Xi2 (k))T^ 2 (k) < oo . 



fc>3 fc>0 



Easier control holds for the other terms in (B.8) and (B.9). Consequently (B.7) holds as soon
as (3.7) holds, and the proof is complete.



B.4. Proof of Theorem 5.1. The proof of the consistency under the assumptions of Theorem
5.1 is quite different from the proof of the consistency under Conditions (C1)-(C2) in Theorem
3.1. This comes from the fact that $S_n(\theta)$ is now a triangular array of the form

$S_n(\theta) = \dfrac{1}{n}\sum_{k=1}^{n}\psi_n(Z_k, Z_{k-1})$ with $\psi_n(Z_1, Z_0) = \dfrac{1}{2\pi}\mathrm{Re}\displaystyle\int \dfrac{\big((Z_1 - f_\theta)^2 w\big)^*(t)\,e^{-itZ_0}\,K^*_{C_n}(t)}{f_\varepsilon^*(-t)}\,dt$.

In this context we show that

i) For all $\theta$ in $\Theta$, $E[(S_n(\theta) - S_{\theta^0,P_X}(\theta))^2] = o(1)$ as $n \to \infty$.

ii) The control (B.1) holds.

Note first that ii) follows from the upper bound (B.2) and Assumption (A4).
For the proof of i) we check that for all $\theta \in \Theta$,

(B.12)   $E[S_n(\theta)] - S_{\theta^0,P_X}(\theta) = o(1)$ and $\mathrm{Var}(S_n(\theta)) = o(1)$, as $n \to \infty$.

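Proof of the first part of (B.12). Since $Z_0 = X_0 + \varepsilon_0$, with $\varepsilon_0$ independent of $(Z_1, X_0)$, it follows
that

$E[S_n(\theta)] = E\big[\mathrm{Re}\,\big((Z_1 - f_\theta)^2 w\big) * K_{n,C_n}(Z_0)\big] = E\big[\big((Z_1 - f_\theta)^2 w\big) * K_{C_n}(X_0)\big]$,

hence

$E[S_n(\theta)] - S_{\theta^0,P_X}(\theta) = \dfrac{1}{2\pi}\displaystyle\iint \big(f^2_{\theta^0}(x) + \sigma_\xi^2 + \sigma_\varepsilon^2\big)\,e^{-iux}\,w^*(u)\,(K^*_{C_n} - 1)(u)\,du\,P_X(dx)$

$\qquad - \dfrac{1}{\pi}\displaystyle\iint f_{\theta^0}(x)\,e^{-iux}\,(f_\theta w)^*(u)\,(K^*_{C_n} - 1)(u)\,du\,P_X(dx)$

$\qquad + \dfrac{1}{2\pi}\displaystyle\iint e^{-iux}\,(f^2_\theta w)^*(u)\,(K^*_{C_n} - 1)(u)\,P_X(dx)\,du$.

Now, arguing as in Butucea and Taupin (?) we get that $|E[S_n(\theta)] - S_{\theta^0,P_X}(\theta)|^2 = o(1)$.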
Proof of the second part of (B.12). Using that the $Z_i$'s are strictly stationary we get that



Var [5V. 



Var 



-^^[((Zk- fe) 2 w)*K rhCn (Z k -i) 
k=i 

I 2 n 

< - Var (Ai, ) + Cov(A 1)0 , A Li A \ 



i=2 



< - Var (Ai, ) + - V | Cov(A li0 , A. 

fc=3 



with 



k,fc-l 



((Z fc -/ fl ) 2 ti;)* J Pf n)Cn (Z fc _ 1 ; 



Arguing as in Butucea and Taupin (?) we obtain that lim n _ > . 00 n 1 Var (A^q) = 0. It remains 
to study 



1 n 

\ Cov(A 1)0 ,A fc)fc _i)|. 



fc=3 



Lemma B.l. Let ^ such that E,(\^/(Z)\) < oo and let <3> be an integrable function. Let 

B fc , Jfe _i = R[e*(Z fc )**iir ni c n (^fc_i)]. 

ITien /or A; > 3 

Cov(S fc , fc _i,5i,o) = Cov[^(Z fe )$*^ ai (X fe _ 1 ),^(Z 1 )$*^ ai (X )] 

^jj?// <J>*(t)^*( S )Cov(^(Z fc )e~^-\^(Z 1 )e-^)^Jt)^ n ( S )^. 

Proof of Lemma \B.1[ By stationarity we write 

Cov(S fc(Jfe _ 1 ,Si, ) = nBk,k-iB lfi ) - E(B k:k _ 1 )E(B lfi ) = E(B k>k _ x B lfl ) - {E(B lfi )) 2 . 

Now, we use that the sequences (Xk)kez and (efc)fcgz are independent. This implies that (Z\,Xq) 
is independent of £o and thus 



E(Bi, ) = ^Me y **(t)E[*(Zi)e" 
In the same way, for k > 3, 



E(-Bfc,fc-i-Si,o) 
1 



E // **(«)$*(t)*(Z fc )*(Zi)Re(e 



-itz k _j ±y a 



m-t) 



3M 



-isZ{) Cj 



(27T) 

and the lemma is proved. 



□ 



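It follows from Lemma B.1 that for $k \ge 3$,

$$\operatorname{Cov}(A_{k,k-1}, A_{1,0}) = \operatorname{Cov}\big[((Z_k - f_\theta)^2 w) * K_{C_n}(X_{k-1}),\ ((Z_1 - f_\theta)^2 w) * K_{C_n}(X_0)\big] = \sum_{i=1}^{9} C_i,$$

with

$$\begin{aligned}
C_1 &= \frac{1}{(2\pi)^2}\iint \operatorname{Cov}\big(e^{-itX_{k-1}}, e^{-isX_0}\big)\,(w f_\theta^2)^*(t)\,(w f_\theta^2)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
C_2 &= \frac{1}{\pi^2}\iint \operatorname{Cov}\big(X_k e^{-itX_{k-1}}, X_1 e^{-isX_0}\big)\,(w f_\theta)^*(t)\,(w f_\theta)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
C_3 &= \frac{1}{(2\pi)^2}\iint \operatorname{Cov}\big[(X_k^2 + \varepsilon_k^2) e^{-itX_{k-1}}, (X_1^2 + \varepsilon_1^2) e^{-isX_0}\big]\, w^*(t)\, w^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
C_4 &= \frac{-1}{2\pi^2}\iint \operatorname{Cov}\big(X_k e^{-itX_{k-1}}, e^{-isX_0}\big)\,(w f_\theta)^*(t)\,(w f_\theta^2)^*(s)\, K^*_{C_n}(s) K^*_{C_n}(t)\, dt\, ds, \\
C_5 &= \frac{-1}{2\pi^2}\iint \operatorname{Cov}\big(e^{-itX_{k-1}}, X_1 e^{-isX_0}\big)\,(w f_\theta^2)^*(t)\,(w f_\theta)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
C_6 &= \frac{1}{(2\pi)^2}\iint \operatorname{Cov}\big[(X_k^2 + \varepsilon_k^2) e^{-itX_{k-1}}, e^{-isX_0}\big]\, w^*(t)\,(w f_\theta^2)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
C_7 &= \frac{1}{(2\pi)^2}\iint \operatorname{Cov}\big[e^{-itX_{k-1}}, (X_1^2 + \varepsilon_1^2) e^{-isX_0}\big]\, w^*(s)\,(w f_\theta^2)^*(t)\, K^*_{C_n}(s) K^*_{C_n}(t)\, dt\, ds, \\
C_8 &= \frac{-1}{2\pi^2}\iint \operatorname{Cov}\big[(X_k^2 + \varepsilon_k^2) e^{-itX_{k-1}}, X_1 e^{-isX_0}\big]\, w^*(t)\,(w f_\theta)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
C_9 &= \frac{-1}{2\pi^2}\iint \operatorname{Cov}\big[X_k e^{-itX_{k-1}}, (X_1^2 + \varepsilon_1^2) e^{-isX_0}\big]\,(w f_\theta)^*(t)\, w^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds.
\end{aligned}$$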
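Easy computations give

$$\begin{aligned}
\operatorname{Cov}\big[(X_k^2 + \varepsilon_k^2) e^{-itX_{k-1}}, (X_1^2 + \varepsilon_1^2) e^{-isX_0}\big] ={}& \operatorname{Cov}\big(X_k^2 e^{-itX_{k-1}}, X_1^2 e^{-isX_0}\big) + \sigma_\varepsilon^2 \operatorname{Cov}\big(X_k^2 e^{-itX_{k-1}}, e^{-isX_0}\big) \\
&+ \sigma_\varepsilon^2 \operatorname{Cov}\big(e^{-itX_{k-1}}, X_1^2 e^{-isX_0}\big) + \sigma_\varepsilon^4 \operatorname{Cov}\big(e^{-itX_{k-1}}, e^{-isX_0}\big), \\
\operatorname{Cov}\big[(X_k^2 + \varepsilon_k^2) e^{-itX_{k-1}}, e^{-isX_0}\big] ={}& \operatorname{Cov}\big(X_k^2 e^{-itX_{k-1}}, e^{-isX_0}\big) + \sigma_\varepsilon^2 \operatorname{Cov}\big(e^{-itX_{k-1}}, e^{-isX_0}\big), \\
\operatorname{Cov}\big[(X_k^2 + \varepsilon_k^2) e^{-itX_{k-1}}, X_1 e^{-isX_0}\big] ={}& \operatorname{Cov}\big(X_k^2 e^{-itX_{k-1}}, X_1 e^{-isX_0}\big) + \sigma_\varepsilon^2 \operatorname{Cov}\big(e^{-itX_{k-1}}, X_1 e^{-isX_0}\big),
\end{aligned}$$

which induces the decomposition $\operatorname{Cov}(A_{k,k-1}, A_{1,0}) = \sum_{i=1}^{9} E_i$, with

$$\begin{aligned}
E_1 ={}& \frac{1}{(2\pi)^2}\iint \operatorname{Cov}\big(e^{-itX_{k-1}}, e^{-isX_0}\big)\, K^*_{C_n}(t) K^*_{C_n}(s) \\
&\times \big[(w f_\theta^2)^*(t) + \sigma_\varepsilon^2 w^*(t)\big]\big[(w f_\theta^2)^*(s) + \sigma_\varepsilon^2 w^*(s)\big]\, dt\, ds, \\
E_2 ={}& \frac{1}{\pi^2}\iint \operatorname{Cov}\big(X_k e^{-itX_{k-1}}, X_1 e^{-isX_0}\big)\,(w f_\theta)^*(t)\,(w f_\theta)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
E_3 ={}& \frac{1}{(2\pi)^2}\iint \operatorname{Cov}\big(X_k^2 e^{-itX_{k-1}}, X_1^2 e^{-isX_0}\big)\, K^*_{C_n}(t)\, w^*(t)\, K^*_{C_n}(s)\, w^*(s)\, dt\, ds, \\
E_4 ={}& \frac{-1}{2\pi^2}\iint \operatorname{Cov}\big(X_k e^{-itX_{k-1}}, e^{-isX_0}\big)\, K^*_{C_n}(s) K^*_{C_n}(t)\,(w f_\theta)^*(t)\,\big(\sigma_\varepsilon^2 w^*(s) + (w f_\theta^2)^*(s)\big)\, dt\, ds, \\
E_5 ={}& \frac{-1}{2\pi^2}\iint \operatorname{Cov}\big(e^{-itX_{k-1}}, X_1 e^{-isX_0}\big)\, K^*_{C_n}(s) K^*_{C_n}(t)\,(w f_\theta)^*(s)\,\big(\sigma_\varepsilon^2 w^*(t) + (w f_\theta^2)^*(t)\big)\, dt\, ds, \\
E_6 ={}& \frac{1}{(2\pi)^2}\iint \operatorname{Cov}\big(X_k^2 e^{-itX_{k-1}}, e^{-isX_0}\big)\, K^*_{C_n}(t) K^*_{C_n}(s)\, w^*(t)\,\big(\sigma_\varepsilon^2 w^*(s) + (w f_\theta^2)^*(s)\big)\, dt\, ds, \\
E_7 ={}& \frac{1}{(2\pi)^2}\iint \operatorname{Cov}\big(e^{-itX_{k-1}}, X_1^2 e^{-isX_0}\big)\, K^*_{C_n}(t) K^*_{C_n}(s)\, w^*(s)\,\big(\sigma_\varepsilon^2 w^*(t) + (w f_\theta^2)^*(t)\big)\, dt\, ds, \\
E_8 ={}& \frac{-1}{2\pi^2}\iint \operatorname{Cov}\big(X_k^2 e^{-itX_{k-1}}, X_1 e^{-isX_0}\big)\, w^*(t)\,(w f_\theta)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
E_9 ={}& \frac{-1}{2\pi^2}\iint \operatorname{Cov}\big(X_k e^{-itX_{k-1}}, X_1^2 e^{-isX_0}\big)\,(w f_\theta)^*(t)\, w^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds.
\end{aligned}$$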
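Using (A.24) and (B.10), we have the upper bounds

$$\begin{aligned}
|\operatorname{Cov}(e^{-itX_{k-1}}, e^{-isX_0})| &\le C\,\alpha_X(k-1), \\
|\operatorname{Cov}(X_k e^{-itX_{k-1}}, X_1 e^{-isX_0})| &\le C\int_0^{\alpha_X(k-2)} Q_{|X|}^2(u)\, du, \\
|\operatorname{Cov}(X_k^2 e^{-itX_{k-1}}, X_1^2 e^{-isX_0})| &\le C\int_0^{\alpha_X(k-2)} Q_{|X|}^4(u)\, du, \\
|\operatorname{Cov}(X_k^2 e^{-itX_{k-1}}, e^{-isX_0})| &\le C\int_0^{\alpha_X(k-1)} Q_{|X|}^2(u)\, du, \\
|\operatorname{Cov}(e^{-itX_{k-1}}, X_1^2 e^{-isX_0})| &\le C\int_0^{\alpha_X(k-2)} Q_{|X|}^2(u)\, du, \\
|\operatorname{Cov}(X_k^2 e^{-itX_{k-1}}, X_1 e^{-isX_0})| &\le C\int_0^{\alpha_X(k-2)} Q_{|X|}^3(u)\, du, \\
|\operatorname{Cov}(X_k e^{-itX_{k-1}}, X_1^2 e^{-isX_0})| &\le C\int_0^{\alpha_X(k-2)} Q_{|X|}^3(u)\, du.
\end{aligned}$$

Since $\mathbb{E}(X_0^4) < \infty$ and $\lim_{k\to\infty}\alpha_X(k) = 0$, we infer that $\lim_{k\to\infty} |\operatorname{Cov}(A_{k,k-1}, A_{1,0})| = 0$. Now, by Cesaro's mean convergence theorem

$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=3}^{n} |\operatorname{Cov}(A_{1,0}, A_{k,k-1})| = 0.$$

This completes the proof of the consistency.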
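The Cesaro step says that if the individual covariances tend to zero, so does their running average. A minimal numerical sketch follows. The sequence `c` is a hypothetical stand-in for $|\operatorname{Cov}(A_{1,0}, A_{k,k-1})|$, chosen here to decay like $k^{-1/2}$; it is not computed from the model.

```python
import math


def cesaro_mean(c, n, start=3):
    """Return (1/n) * sum_{k=start}^{n} c(k), the averaged sum used in the proof."""
    return sum(c(k) for k in range(start, n + 1)) / n


def c(k):
    # Hypothetical covariance bound tending to zero (assumption for illustration only).
    return 1.0 / math.sqrt(k)


# The running averages shrink as n grows, even though the decay of c is slow.
averages = [cesaro_mean(c, n) for n in (10, 100, 10_000)]
```

Summability of `c` is not needed for this step; it only matters later, when a rate of order $1/n$ is required.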
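B.5. Proof of Theorem 5.2

Proof of 1) in Theorem 5.2. Starting from the decomposition (B.4) we shall check the three following points.

i) $\mathbb{E}\big[(S_n^{(1)}(\theta^0) - S^{(1)}_{\theta^0,P_X}(\theta^0))(S_n^{(1)}(\theta^0) - S^{(1)}_{\theta^0,P_X}(\theta^0))^\top\big] = O(\varphi_n\varphi_n^\top)$.

ii) $S_n^{(2)}(\theta^0) \xrightarrow{\ \mathbb{P}\ } S^{(2)}_{\theta^0,P_X}(\theta^0)$.

iii) $R_n$ defined in (B.3) satisfies $R_n \xrightarrow{\ \mathbb{P}\ } 0$.

The rate of convergence of $\hat\theta$ is thus given by the order of

$$\mathbb{E}\big[(S_n^{(1)}(\theta^0) - S^{(1)}_{\theta^0,P_X}(\theta^0))(S_n^{(1)}(\theta^0) - S^{(1)}_{\theta^0,P_X}(\theta^0))^\top\big].$$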



Proof of i) 



We first write 







(5W(0)). = i^^Me[((Z fc -/ e ) 2 u;)*^ n ,c„(^-i)-E[(Z fc -/ e (X fc _ 1 ))M^-i, 



k=l 



fe=l 



E 



_d_ 



(z k - MXk^fwiXk-i] 



Study of the bias. As in Butucea and Taupin (?), we get that 



E 



< Ci(f 9 o,w,f £ )mm 



B [1] .B [2] . 



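for $B^{[q]}$, $q = 1, 2$, defined in Theorem 5.2.

Study of the variance. For the variance term, note first that

$$\operatorname{Var}\big((S_n^{(1)}(\theta^0))_j\big) \le \frac{3}{n}\operatorname{Var}(D_{1,0}) + \frac{2}{n}\sum_{k=3}^{n} |\operatorname{Cov}(D_{1,0}, D_{k,k-1})|$$

with

$$D_{k,k-1} = \mathrm{Re}\big((-2Z_k f^{(1)}_{\theta^0,j} w + 2 f_{\theta^0} f^{(1)}_{\theta^0,j} w) * K_{n,C_n}(Z_{k-1})\big).$$

The first part in $\operatorname{Var}[(S_n^{(1)}(\theta^0))_j]$ is controlled as in Butucea and Taupin (?) by

$$\frac{C}{n}\min\{V_j^{[1]}(\theta^0), V_j^{[2]}(\theta^0)\}, \tag{B.13}$$

with $V_j^{[q]}$, $q = 1, 2$, defined in Theorem 5.2. We now control the term

$$\frac{1}{n}\sum_{k=3}^{n} |\operatorname{Cov}(D_{1,0}, D_{k,k-1})|.$$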



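Applying again Lemma B.1 we obtain that

$$\operatorname{Cov}(D_{1,0}, D_{k,k-1}) = F_1 + F_2 + F_3 + F_4$$

with

$$\begin{aligned}
F_1 &= \frac{1}{\pi^2}\,\mathrm{Re}\iint \operatorname{Cov}\big(X_k e^{-itX_{k-1}}, X_1 e^{-isX_0}\big)\,(f^{(1)}_{\theta^0,j} w)^*(t)\,(f^{(1)}_{\theta^0,j} w)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
F_2 &= \frac{1}{\pi^2}\,\mathrm{Re}\iint \operatorname{Cov}\big(e^{-itX_{k-1}}, e^{-isX_0}\big)\,(f_{\theta^0} f^{(1)}_{\theta^0,j} w)^*(t)\,(f_{\theta^0} f^{(1)}_{\theta^0,j} w)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
F_3 &= \frac{-1}{\pi^2}\,\mathrm{Re}\iint \operatorname{Cov}\big(X_k e^{-itX_{k-1}}, e^{-isX_0}\big)\,(f^{(1)}_{\theta^0,j} w)^*(t)\,(f_{\theta^0} f^{(1)}_{\theta^0,j} w)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds, \\
F_4 &= \frac{-1}{\pi^2}\,\mathrm{Re}\iint \operatorname{Cov}\big(e^{-itX_{k-1}}, X_1 e^{-isX_0}\big)\,(f_{\theta^0} f^{(1)}_{\theta^0,j} w)^*(t)\,(f^{(1)}_{\theta^0,j} w)^*(s)\, K^*_{C_n}(t) K^*_{C_n}(s)\, dt\, ds.
\end{aligned}$$

Using (A.24) and (B.10) we have the upper bounds

$$\begin{aligned}
|\operatorname{Cov}(e^{-itX_{k-1}}, e^{-isX_0})| &\le C\,\alpha_X(k-1), \\
|\operatorname{Cov}(X_k e^{-itX_{k-1}}, X_1 e^{-isX_0})| &\le C\int_0^{\alpha_X(k-2)} Q_{|X|}^2(u)\, du, \\
|\operatorname{Cov}(X_k e^{-itX_{k-1}}, e^{-isX_0})| &\le C\int_0^{\alpha_X(k-1)} Q_{|X|}(u)\, du, \\
|\operatorname{Cov}(e^{-itX_{k-1}}, X_1 e^{-isX_0})| &\le C\int_0^{\alpha_X(k-2)} Q_{|X|}(u)\, du.
\end{aligned}$$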



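Since $\mathbb{E}(X_0^4) < \infty$, we infer that $Q_{|X|}(u) \le C u^{-1/4}$, and consequently all the covariance terms are $O(\sqrt{\alpha_X(k)})$. Finally, if $\sum_{k>0}\sqrt{\alpha_X(k)} < \infty$, then

$$\frac{1}{n}\sum_{k=3}^{n} |\operatorname{Cov}(D_{1,0}, D_{k,k-1})| \le \frac{C}{n}.$$

This, together with (B.13), implies that

$$\operatorname{Var}\big[(S_n^{(1)}(\theta^0))_j\big] \le \frac{C}{n}\min\{V_j^{[1]}(\theta^0), V_j^{[2]}(\theta^0)\}.$$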
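The passage from the quantile bound to the order $\sqrt{\alpha_X(k)}$ can be spelled out as follows. This is a sketch under the usual convention, assumed here, that $Q_{|X|}$ is the generalized inverse of the tail function $t \mapsto \mathbb{P}(|X_0| > t)$; the constants are not claimed to be those of the original proof.

```latex
\begin{aligned}
\mathbb{P}(|X_0| > t) \le \frac{\mathbb{E}(X_0^4)}{t^4}
 \quad &\Longrightarrow \quad
 Q_{|X|}(u) \le \Big(\frac{\mathbb{E}(X_0^4)}{u}\Big)^{1/4}, \\
\int_0^{a} Q_{|X|}^2(u)\, du
 &\le \sqrt{\mathbb{E}(X_0^4)} \int_0^{a} u^{-1/2}\, du
 = 2\sqrt{\mathbb{E}(X_0^4)}\;\sqrt{a}, \\
\int_0^{a} Q_{|X|}(u)\, du
 &\le \big(\mathbb{E}(X_0^4)\big)^{1/4} \int_0^{a} u^{-1/4}\, du
 = \tfrac{4}{3}\big(\mathbb{E}(X_0^4)\big)^{1/4} a^{3/4}.
\end{aligned}
```

With $a = \alpha_X(k-1)$ or $a = \alpha_X(k-2)$, and since $a \le 1$ gives $a \le a^{3/4} \le \sqrt{a}$, each of the four bounds is at most a constant times $\sqrt{a}$. Shifting the index by one or two does not affect the summability condition.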
l j j n J J 

Proof of ii) 

The proof of ii) starts from the expression of the second derivative of the estimation criterion 
(B.14) 

1 A r ( d 2 d 2 , \ * KZ (t)e~ itz ^ 



Following the same lines as for the consistency we prove that

$$S_n^{(2)}(\theta^0) \xrightarrow{\ \mathbb{P}\ } S^{(2)}_{\theta^0,P_X}(\theta^0).$$

□

Proof of iii)
The proof of iii) follows from (B.14), from the smoothness properties of $w f_\theta$ and from Assumption (A4).

□ 

Proof of 2) in Theorem 5.2. The proof of 2) in Theorem 5.2 is quite similar to the proof of 1).

The main differences appear in the control of the bias and variance of $S_n^{(1)}(\theta^0)$. More precisely, we start from



k=l 



E 



8 



de 

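Study of the bias. Since $P_{Z,X}(z,x) = P_X(x) f_\varepsilon(z - x)$, we obtain that $\mathbb{E}[S_n^{(1)}(\theta^0)] - S^{(1)}_{\theta^0,P_X}(\theta^0)$ is equal to

$$-2\,\mathbb{E}\big[f_{\theta^0}(X_0)\,(f^{(1)}_{\theta^0} w) * K_{C_n}(X_0) - f_{\theta^0}(X_0) f^{(1)}_{\theta^0}(X_0) w(X_0)\big] + 2\,\mathbb{E}\big[(f_{\theta^0} f^{(1)}_{\theta^0} w) * K_{C_n}(X_0) - f_{\theta^0}(X_0) f^{(1)}_{\theta^0}(X_0) w(X_0)\big],$$

that is, $\mathbb{E}[S_n^{(1)}(\theta^0)] - S^{(1)}_{\theta^0,P_X}(\theta^0)$ is equal to

$$-\frac{1}{\pi}\,\mathrm{Re}\iint f_{\theta^0}(x)\, e^{-iux}\, (f^{(1)}_{\theta^0} w)^*(u)\,(K^*_{C_n}(u) - 1)\, P_X(dx)\, du + \frac{1}{\pi}\,\mathrm{Re}\iint e^{-iux}\, (f_{\theta^0} f^{(1)}_{\theta^0} w)^*(u)\,(K^*_{C_n}(u) - 1)\, P_X(dx)\, du.$$

It follows that for $j = 1, \dots, d$,

$$\big|\mathbb{E}[(S_n^{(1)}(\theta^0))_j] - (S^{(1)}_{\theta^0,P_X}(\theta^0))_j\big| \le \mathbb{E}|f_{\theta^0}(X_0)| \int \big|(f^{(1)}_{\theta^0,j} w)^*(u)\,(K^*_{C_n}(u) - 1)\big|\, du + \int \big|(f_{\theta^0} f^{(1)}_{\theta^0,j} w)^*(u)\,(K^*_{C_n}(u) - 1)\big|\, du.$$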



Study of the variance. For the study of the variance we combine the proof in Butucea and Taupin (?) and the proof of 1) of Theorem 5.2. For these reasons we only give a sketch of the proof, with details only for specific parts. As for the proof of 1) we start from

$$\operatorname{Var}\big((S_n^{(1)}(\theta^0))_j\big) = \frac{1}{n}\operatorname{Var}\Big[\frac{\partial[-2Z_k f_\theta w + f_\theta^2 w]}{\partial\theta_j}\Big|_{\theta=\theta^0} * K_{n,C_n}(Z_{k-1})\Big] + \frac{2}{n^2}\sum_{1\le l<k\le n} \operatorname{Cov}(D_{k,k-1}, D_{l,l-1}),$$

with $D_{k,k-1}$ defined in (B.5). The control of $(2/n^2)\sum_{1\le l<k\le n} \operatorname{Cov}(D_{k,k-1}, D_{l,l-1})$ is done as in the proof of 1). We now control the first part of $\operatorname{Var}((S_n^{(1)}(\theta^0))_j)$.



Var[(^ 1) (^°)) i ] < ^-MeE 



(7 



d[-2Zifew + fjw\ 

86; 



\e=e° 



* K n ,c n ( z i 
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In other words, 



C. 



Var (9%) < -ReE [[Zifffw + feofffwj * K n , Cn 
C 



leE 



■n 



Now, write that 



ReE 



Ih+II 2 , 



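with

$$II_1 = \iint f_\varepsilon(z - x)\,\big(f_{\theta^0}^2(x) + \sigma_\xi^2\big)\Big(\int (f^{(1)}_{\theta^0,j} w)(u)\, K_{n,C_n}(z - u)\, du\Big)^2 P_X(dx)\, dz,$$

$$II_2 = \iint f_\varepsilon(z - x)\Big(\int (f_{\theta^0} f^{(1)}_{\theta^0,j} w)(u)\, K_{n,C_n}(z - u)\, du\Big)^2 P_X(dx)\, dz.$$

We apply Hölder's inequality and obtain that

$$|II_1| \le \sup_{z\in\mathbb{R}} \mathbb{E}\big[(f_{\theta^0}^2(X_0) + \sigma_\xi^2)\, f_\varepsilon(z - X_0)\big]\, \big\|(f^{(1)}_{\theta^0,j} w) * K_{n,C_n}\big\|_2^2,$$

and that $|II_1|$ is also less than

$$\mathbb{E}\big[f_{\theta^0}^2(X_0) + \sigma_\xi^2\big]\, \big\|(f^{(1)}_{\theta^0,j} w) * K_{n,C_n}\big\|_\infty^2.$$

In the same way we have

$$|II_2| \le \sup_{z\in\mathbb{R}} \mathbb{E}[f_\varepsilon(z - X_0)]\, \big\|(f_{\theta^0} f^{(1)}_{\theta^0,j} w) * K_{n,C_n}\big\|_2^2 \quad\text{and}\quad |II_2| \le \big\|(f_{\theta^0} f^{(1)}_{\theta^0,j} w) * K_{n,C_n}\big\|_\infty^2.$$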
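Both pairs of bounds are instances of the same two-way estimate. The display below is a generic sketch: $g \ge 0$ stands for the weight (for $II_1$, $g(x) = f_{\theta^0}^2(x) + \sigma_\xi^2$; for $II_2$, $g \equiv 1$) and $h$ for the convolution $(\cdot\, w) * K_{n,C_n}$. These names are introduced here for illustration and do not appear in the original.

```latex
\begin{aligned}
\iint f_\varepsilon(z - x)\, g(x)\, h(z)^2\, P_X(dx)\, dz
 &= \int \mathbb{E}\big[g(X_0)\, f_\varepsilon(z - X_0)\big]\, h(z)^2\, dz \\
 &\le \sup_{z \in \mathbb{R}} \mathbb{E}\big[g(X_0)\, f_\varepsilon(z - X_0)\big]\; \|h\|_2^2, \\
\iint f_\varepsilon(z - x)\, g(x)\, h(z)^2\, P_X(dx)\, dz
 &\le \|h\|_\infty^2 \int g(x) \Big(\int f_\varepsilon(z - x)\, dz\Big) P_X(dx)
 = \mathbb{E}[g(X_0)]\; \|h\|_\infty^2 .
\end{aligned}
```

The first bound keeps the dependence on $f_\varepsilon$ in the constant and uses the $L^2$ norm; the second uses only that $f_\varepsilon$ integrates to one and pays with the sup norm.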



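Consequently we have

$$\operatorname{Var}\big[(S_n^{(1)}(\theta^0))_j\big] \le \frac{C(\sigma_\xi^2, f_{\theta^0}, f_\varepsilon)}{n}\Big[\big\|(f^{(1)}_{\theta^0,j} w) * K_{n,C_n}\big\|_2^2 + \big\|(f_{\theta^0} f^{(1)}_{\theta^0,j} w) * K_{n,C_n}\big\|_2^2\Big] \tag{B.15}$$

and

$$\operatorname{Var}\big[(S_n^{(1)}(\theta^0))_j\big] \le \frac{C(\sigma_\xi^2, f_{\theta^0})}{n}\Big[\big\|(f^{(1)}_{\theta^0,j} w) * K_{n,C_n}\big\|_\infty^2 + \big\|(f_{\theta^0} f^{(1)}_{\theta^0,j} w) * K_{n,C_n}\big\|_\infty^2\Big]. \tag{B.16}$$

By combining (B.15) and (B.16), we get that

$$\operatorname{Var}\big[(S_n^{(1)}(\theta^0))_j\big] \le \frac{C(\sigma_\xi^2, f_{\theta^0}, f_\varepsilon)}{n}\min\{V_j^{[1]}(\theta^0), V_j^{[2]}(\theta^0)\}$$

with $V_j^{[q]}$, $q = 1, 2$, defined in Theorem 5.2.

□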